Abstract

We prove a strong limitation on the ability of entangled provers to collude in a multiplayer 
game. Our main result is the first nontrivial lower bound on the class MIP* of languages having 
multi-prover interactive proofs with entangled provers; namely MIP* contains NEXP, the class 
of languages decidable in non-deterministic exponential time. While Babai, Fortnow, and Lund 
(Computational Complexity 1991) proved the celebrated equality MIP = NEXP in the absence 
of entanglement, ever since the introduction of the class MIP* it was open whether shared 
entanglement between the provers could weaken or strengthen the computational power of 
multi-prover interactive proofs. Our result shows that it does not weaken their computational 
power: MIP ⊆ MIP*.

At the heart of our result is a proof that Babai, Fortnow, and Lund's multilinearity test is 
sound even in the presence of entanglement between the provers, and our analysis of this test 
could be of independent interest. As a byproduct we show that the correlations produced 
by any entangled strategy which succeeds in the multilinearity test with high probability can 
always be closely approximated using shared randomness alone. 

1 Introduction 

Multiprover interactive proof systems [BGKW88] are at the heart of much of the recent history 
of complexity theory, and the celebrated characterization MIP = NEXP [BFL91] is one of the cor- 
nerstones on which the PCP theorem [AS98, ALMSS98] was built. While the key assumption on 
the multiple provers in an interactive proof system is that they are not allowed to communicate, 
traditionally this has been taken to mean that their only distributed resource was shared random- 
ness. In a quantum universe, however, it is natural to relax this assumption and allow the provers 
to share entanglement. While still not allowing them to communicate, this increases their ability 
to collude against the verifier by exploiting the nonlocal correlations allowed by entanglement. 
The corresponding complexity class MIP* was introduced in [CHTW04], raising a fundamental 
question: what is the computational complexity of entangled provers? 
Even before their modern re-formulation in the language of multiplayer games, starting with
the work of Bell in the 1960s [Bel64] the strength of the nonlocal correlations that could be ob- 
tained from performing local measurements on entangled particles has been intensely investi- 
gated through the use of Bell inequalities (upper bounds on the strength of classical correlations) 
and Tsirelson inequalities (upper bounds on the strength of quantum correlations). Games, or proof 
systems, generalize this setup by introducing an additional layer of interaction: in this new context, 
we think of the experimenter (the verifier) as interacting with the physical devices (the provers) 
through the specific choice of settings (questions) that he makes, and the outcomes (answers) that 
he observes. The arbitrary state and measurements that are actually made inside the devices are 
reflected in the provers' freedom in choosing their strategy. The fundamental observation that 
quantum mechanics violates certain Bell inequalities translates into the fact that there exist
interactive proof systems in which entangled provers can have a strictly higher success probability
than could any classical, non-entangled provers. 

A dramatic demonstration of this possibility is given by the Magic Square game [Mer90, Per90], 
a simple one-round game for which the maximum success probability of classical provers is 8/9, 
but there exists a perfect winning strategy for entangled provers. Cleve, Høyer, Toner, and Watrous
[CHTW04] were the first to draw complexity-theoretic consequences from such non-local
properties of entanglement. They study the class ⊕MIP of languages having two-prover interactive
proofs in which there is a single round of interaction, each of the provers is restricted to answering
a single bit, and the verifier only bases his accept/reject decision on the parity of the two
bits that he received. While it follows from work of Håstad [Has01] that this class equals NEXP
(and is thus as powerful as the whole of MIP) for an appropriate setting of completeness and
soundness parameters, Cleve et al. show that the corresponding entangled-prover class ⊕MIP*
collapses to EXP for any choice of completeness and soundness parameters that are separated by
an inverse polynomial gap. 1
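The classical value 8/9 quoted for the Magic Square game can be checked by exhaustive search. The sketch below assumes the standard formulation of the game, which this section does not spell out: one prover receives a row and fills it with three bits of even parity, the other receives a column and fills it with three bits of odd parity, and they win if they agree on the shared cell. Shared randomness cannot beat the best deterministic strategy, so only deterministic strategies are enumerated.

```python
from itertools import product

# Assumed rules (standard formulation, not stated in the text above):
# rows must have even parity, columns must have odd parity.
EVEN = [t for t in product((0, 1), repeat=3) if sum(t) % 2 == 0]
ODD = [t for t in product((0, 1), repeat=3) if sum(t) % 2 == 1]

def classical_value():
    """Largest number of the 9 (row, column) question pairs that a
    deterministic classical strategy can win."""
    best = 0
    for alice in product(EVEN, repeat=3):   # alice[r] = answer to row r
        for bob in product(ODD, repeat=3):  # bob[c] = answer to column c
            wins = sum(alice[r][c] == bob[c][r]
                       for r in range(3) for c in range(3))
            best = max(best, wins)
    return best

print(classical_value(), "out of 9")
```

The search never reaches 9 because the parity constraints force the two provers' implied 3-by-3 tables to differ in at least one cell.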

Despite intense efforts, for a long time little more was known, and prior to our work the best 
lower bound on MIP* resulted from the trivial observation that multiple entangled provers are 
at least as powerful as a single prover, hence IP = PSPACE ⊆ MIP*, where the first equality is
due to [LFKN92, Sha92]. 2 The main difficulty in improving this trivial lower bound is the fol- 
lowing: while the PCP theorem gives us a variety of two-prover interactive proof systems for 
NEXP-complete problems, there is no a priori reason (see e.g. the Magic Square game, which has 
a similar structure to that of basic proof systems for MAX-3-XOR, or the aforementioned collapse 
of ⊕MIP*) that they should remain sound in the presence of entanglement. Indeed, if one considers
provers allowed to reproduce any distribution that is no-signaling, 3 then it follows from a
linear-programming formulation 4 of the problem that the corresponding class MIP^ns ⊆ EXP;
in fact, for the case of two-prover single-round proof systems it was even shown in [Ito10] that
MIP^ns(2, 1) = PSPACE. Entanglement, however, does not allow the provers to reproduce the full
set of no-signaling strategies, and these results leave the complexity of the class MIP* completely 
open. 

1 This was later improved [Weh06] to the inclusion of ⊕MIP* in the class of two-message single-prover interactive
proofs QIP(2) ⊆ PSPACE [JUW09].

2 It was recently shown that quantum messages are no more powerful than classical messages in single-prover inter- 
active proof systems [JJUW11]: QIP = PSPACE. That result, however, has no direct relationship with our work: in 
our setting the messages remain classical; rather the "quantumness" manifests itself in the presence of entanglement 
between the provers, which is a notion that only arises when more than one prover is present. 

3 A collection of distributions on the provers' answers, one for every tuple of questions, is no-signaling if, for any 
such distribution, its marginal on any subset of the provers is independent of the questions to the remaining provers. 

4 This formulation was first observed by Daniel Preda. 
The fact that entanglement, as a shared resource, is poorly understood is also reflected in the
complete absence of reasonable upper bounds on the complexity class MIP*: while the inclusion
MIP ⊆ NEXP is straightforward, we do not know of any limits on the dimension of entanglement
that may be useful to the provers in a given interactive proof system, and as a result their maxi- 
mum success probability is not even known to be computable (see [SW08, DLTW08, NPA08] for 
more on this aspect). 

Since existing protocols may no longer be sound in the presence of entanglement between 
the provers, previous work has focused on finding ways to modify a given protocol in a way that 
would make it entanglement resistant; that is, honest provers (in the case of a YES-instance) can 
convince the verifier without shared entanglement while dishonest provers (in the case of a NO- 
instance) cannot convince the verifier with high probability even with shared entanglement. This 
was the route taken in [KKMTV11, IKPSY08, IKM09], which introduced techniques to limit the 
provers' use of their entanglement. They proved non-trivial lower bounds on variants of the 
class MIP*, but with error bounds that are weaker than the standard definitions allow for. These 
relatively weak bounds came as a result of the "rounding" technique developed in these works: 
by adding additional constraints to the protocol, one ensures that optimal entangled strategies are 
in a sense close to classical, un-entangled strategies. This closeness, however, was shown using 
a rounding procedure that had a certain "local" flavor, inducing a large loss in the quality of the 
approximation. 5 

In addition, [IKM09], based on [KKMTV11], showed that PSPACE has two-prover one-round 
interactive proofs with entangled provers, with perfect completeness and exponentially small 
soundness error. Prior to our work, this was the best lower bound known on single-round multi- 
prover interactive proof systems with entanglement. 

Additional related work. Given the apparent difficulty of proving good lower bounds on the 
power of multi-prover interactive proof systems with entangled provers, researchers have stud- 
ied a variety of related models. Maybe the most natural extension of MIP* consists in giving the 
verifier more power by allowing him to run in quantum polynomial-time, and exchange quan- 
tum messages with the provers. The resulting class is called QMIP* (the Q stands for "quantum 
verifier", while the * stands for "entangled provers"), and it was formally introduced in [KM03], 
where it was shown that QMIP* contains MIP* (indeed, the verifier can always force classical 
communication by systematically measuring the provers' answers in the computational basis). 
Recently Reichardt et al. [RUV12] showed that QMIP* = MIP* (the possibility of which had been 
suggested earlier in [BFK10]). Ben-Or et al. [BHP08] introduced a model in which the verifier is 
quantum and the provers are allowed communication but no entanglement, and showed that the 
resulting class contains NEXP. Other works attempt to characterize the power of MIP* systems 
using tensor norms [RT07, JPPVW10]; so far however such norms have either led to computable, 
but very imprecise, approximations, or have remained (to the best of our knowledge) intractable. 

1.1 Results 

Let MIP*(k, m, c, s) be the class of languages that can be decided by an m-round interactive proof
system with k (possibly entangled) provers and with completeness c and soundness error s. 6 Our 
main result is the following. 



5 See the "almost-commuting implies nearly-commuting" conjecture in [KKMTV11] for more on this aspect. 
6 We refer to Section 2.2 for a more complete definition of the class MIP*. 
Theorem 1. All languages in NEXP have a three-prover poly-round interactive proof system with perfect
completeness and exponentially small soundness error against entangled provers. That is, for every q ∈
poly, it holds that

\[ \mathrm{NEXP} \subseteq \mathrm{MIP}^*(3, \mathrm{poly}, 1, 2^{-q}). \]

Theorem 1 resolves a long-standing open question [KM03], showing that entanglement does 
not weaken the power of multi-prover interactive proof systems: together with the inclusion 
MIP ⊆ NEXP, it implies that MIP ⊆ MIP*. We note that the proof system in Theorem 1 does not require
honest provers to use any entanglement in order to achieve perfect completeness in the case
of a YES-instance. In other words, if we denote by MIP^er the class of languages having entanglement-resistant
multi-prover interactive proof systems with bounded error, our proof of Theorem 1
shows that NEXP ⊆ MIP^er. Because MIP^er ⊆ MIP by definition, this implies MIP^er = NEXP.

The interactive proof system used in the proof of Theorem 1 uses three provers and a polyno- 
mial number of rounds of interaction. We do not know if the number of provers can be reduced; 
however if one is willing to increase it by one then the amount of interaction required can be re- 
duced to a single round, i.e. one message from the verifier to each prover, and one message from 
each prover to the verifier. Indeed, our proof system has the additional property of being non- 
adaptive: the verifier can select his questions for all the rounds before interacting with any of the 
provers. It is shown in [Ito11] that a non-adaptive entanglement-resistant protocol may be parallelized
to a single round of interaction at the cost of adding an extra prover. Applying this result
to Theorem 1 gives the following corollary. 

Corollary 2. All languages in NEXP have a four-prover one-round interactive proof system with perfect
completeness and soundness error against entangled provers bounded away from 1 by an inverse polynomial,
that is:

\[ \mathrm{NEXP} \subseteq \mathrm{MIP}^*(4, 1, 1, 1 - 1/\mathrm{poly}). \]

Prior results on the complexity of multi-prover interactive proofs with entangled provers have 
often been stated using the languages of games [CHTW04, KKMTV11, KRT10]. The main differ- 
ence, in terms of computational complexity, is in the way the input size is measured. In the case of 
games the input is an explicit description of the game, including a list of all possible questions and 
valid answers, while in the setting of proof systems the messages may be described implicitly: it 
is their length that is polynomial in the input size. 

Because of this difference in scaling, our results do not immediately imply any NP-hardness 
result in the setting of multi-player games with entangled players. Nevertheless, by adapting the 
proof of Theorem 1 and using the PCP theorem one can show the following. There is a constant 
k > 1 and a procedure that, given as input an arbitrary 3-SAT formula with n variables and 
m = poly(n) clauses, runs in time 2^{O(log^k n)} and produces an explicit description of a three-player
game of size S = 2^{O(log^k n)} (i.e. the number of rounds of interaction and the total number of
questions and answers that can be sent and received is at most S). The game has the property that, 
if the 3-SAT formula was satisfiable, then there is a perfect strategy for the players, which does 
not require any entanglement. If, however, the 3-SAT formula was not satisfiable, then there is 
no strategy for the players, even using entanglement, that succeeds with probability greater than 
1/2. 

If one could show the above with constant k = 1 then it would follow that finding a constant-factor
approximation to the maximum success probability of three entangled players in a game
with polynomially many rounds and questions is NP-hard; our result is limited to obtaining some 
possibly large k > 1. The main point, however, is that the hardness of approximation is up 
to constant factors. This is in contrast to all previous results which were limited to hardness of 
approximation up to factors approaching 1 very quickly as the input size grew (even after arbitrary
sequential or even parallel repetition). 7 

At the heart of the proof of Theorem 1 is a soundness analysis of Babai, Fortnow, and Lund's 
multilinearity test in the presence of entanglement between the provers: we show that it is in a sense 
"immune" to the strong non-local correlations that entangled provers may in general afford. We 
believe that this analysis should be of wider interest, and we explain the test and the main ideas 
behind its analysis in the presence of entanglement in Section 1.3 below. We first briefly outline 
the overall structure of our proof system in Section 1.2. It is very similar to the one introduced by 
Babai, Fortnow, and Lund [BFL91] to prove NEXP ⊆ MIP; our contribution consists in proving its
soundness against entangled provers. 

1.2 Proof outline 

Our interactive proof system verifies membership in a specific NEXP-complete language, succinct 
3-colorability (see Problems 1 and 2 in Section 2.3 for a definition). We give a three-prover, poly- 
round interactive protocol for it that has perfect completeness and soundness error bounded away 
from 1 by an inverse-polynomial in the input size. (Theorem 1 is obtained by sequentially repeat- 
ing this interactive proof system.) We emphasize that the proof system we use is not new, as it is 
essentially the same as the one introduced in [BFL91]. We nevertheless outline it because there is 
a small difference in how the "oracle" in [BFL91] is simulated by provers, which is the reason our 
protocol, unlike the one in [BFL91], requires more than two provers. 
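To see why sequential repetition turns an inverse-polynomial soundness gap into the exponentially small error of Theorem 1, the following back-of-the-envelope bound may help. It assumes that the verifier accepts only if every repetition accepts and that the soundness error of r sequential repetitions is at most the r-th power of the single-run error; the names p (the polynomial in the gap) and r (the number of repetitions) are introduced here for illustration only.

```latex
\[
  \Bigl(1 - \tfrac{1}{p(n)}\Bigr)^{r} \;\le\; e^{-r/p(n)} \;\le\; 2^{-q(n)}
  \qquad \text{whenever } r \ge p(n)\,q(n).
\]
```

Under these assumptions a polynomial number of repetitions suffices, and perfect completeness is preserved because honest provers pass every repetition.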

Simplifying a little bit (we refer the reader to Section 3 for details), the verifier in our protocol is 
given as input two integers n, p in unary (think of p as much larger than n, but still polynomial), a 
description of a finite field F of size p, and a low-degree polynomial f : (F^n)^2 × F^2 → F. His goal
is to verify whether there exists a multilinear function g : F^n → F such that f(x, y, g(x), g(y)) = 0
for all x, y ∈ {0, 1}^n ⊆ F^n. If this is the case then the input is a YES-instance, whereas if for all functions
g that are "close" to multilinear functions at least one of the constraints f(x, y, g(x), g(y)) = 0
is not satisfied then it is a NO-instance. The difficulty, of course, is that there are exponentially
many constraints to verify, and all must be satisfied for the instance to be a YES-instance. 

The protocol is divided into two distinct parts, which only weakly interact with each other. 
In the first part of the protocol, the verifier performs a polynomial-round low-degree sum-check test 
with a single prover, say the last prover (see Lemma 9 for an explicit formulation). This test is 
based on ideas already introduced by Lund, Fortnow, Karloff, and Nisan [LFKN92] and can be 
used to verify that a low-degree function defined over F^k vanishes on all of {0, 1}^k. We will apply
it to the low-degree function h : (F^n)^2 → F defined by h(x, y) = f(x, y, g(x), g(y)). An important
point for us is that, in the LFKN protocol, the verifier eventually only needs to evaluate h at a
single point (x, y) ∈ (F^n)^2 chosen uniformly at random. Of course, the verifier only knows f, not
g, and therefore the verifier asks the two remaining provers the values g(x) and g(y).
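The feature of the LFKN protocol used here, that the verifier ends up evaluating h at a single random point, can be seen in a minimal sketch of the basic sum-check interaction. This is only the textbook variant that verifies a claimed value of the sum of h over {0, 1}^k; the vanishing test described above needs additional ideas that are not shown. The prime P, the honest prover, and all function names are assumptions made for illustration.

```python
import random
from itertools import product

P = 101  # hypothetical prime; the field F is taken to be the integers mod P

def lagrange_eval(ys, r):
    """Evaluate at r the polynomial of degree <= d with values ys[t] at t = 0..d."""
    total = 0
    for t, yt in enumerate(ys):
        num, den = 1, 1
        for s in range(len(ys)):
            if s != t:
                num = num * (r - s) % P
                den = den * (t - s) % P
        total = (total + yt * num * pow(den, P - 2, P)) % P
    return total

def honest_round(h, k, d, prefix):
    """Honest prover: values at t = 0..d of the partial sum over the free bits."""
    free = k - len(prefix) - 1
    return [sum(h(prefix + [t] + list(b))
                for b in product((0, 1), repeat=free)) % P
            for t in range(d + 1)]

def sum_check(h, k, d, claimed_sum, prover, rng):
    """Verifier for the claim sum_{b in {0,1}^k} h(b) = claimed_sum (mod P).
    Assumes h has degree at most d >= 1 in each variable."""
    claim, prefix = claimed_sum % P, []
    for _ in range(k):
        ys = prover(h, k, d, prefix)
        if (ys[0] + ys[1]) % P != claim:
            return False
        r = rng.randrange(P)           # verifier's random challenge
        claim = lagrange_eval(ys, r)
        prefix.append(r)
    return h(prefix) % P == claim      # the only evaluation of h by the verifier

# Example: h(x) = x0*x1 + 2*x2 sums to 10 over {0,1}^3.
h = lambda x: (x[0] * x[1] + 2 * x[2]) % P
print(sum_check(h, 3, 1, 10, honest_round, random.Random(1)))
```

In the protocol of this section the final evaluation is what requires the values g(x) and g(y), which the verifier cannot compute himself and therefore requests from the other two provers.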

However, note that here the function g is arbitrary (we are trying to verify its existence), except 
that it has to be multilinear. The goal of the second part of the protocol is to ensure that it is indeed 
chosen according to some multilinear function. Therefore, the verifier will sometimes perform 
a certain "multilinearity test" with the three provers, which enforces that, however the provers 
answer their queries, it must be according to a function that is close to a multilinear function. The 
two tests will be indistinguishable from the point of view of the provers because the marginal 
distribution on the question to each prover is uniform over F^n in both cases.

7 Cleve, Gavinsky, and Jain [CGJ09] obtained a constant-factor hardness result for games with constant answer size, 
but in which the number of questions sent by the verifier is exponential. 
Completeness of the protocol is easy to verify, and in the case of a YES-instance honest provers
do not need any entanglement to be accepted with probability 1. To prove soundness, assuming
three entangled provers succeed with probability that is polynomially close to 1, we wish to
conclude that the instance given as input to the verifier must be a YES-instance. 

Note that provers successful in the overall protocol must, in particular, succeed with high prob- 
ability in the multilinearity test. The key step in the analysis consists in showing the following: 
Any three entangled provers that succeed in the multilinearity test with high probability are "in- 
distinguishable" from classical provers who use shared randomness to jointly sample a multilinear 
function g, and then answer question x with g(x) . This step is the one that requires the most work, 
and we explain it in more detail in the next section. (In particular, we will clarify what is meant 
by "indistinguishable".) 

Assuming this informal statement holds, it is not too hard to conclude the analysis of the 
protocol. Indeed, having replaced two out of the three provers by classical provers, there is only 
a single "quantum" prover left, the one used to perform the sum-check test in the first part of the 
protocol. But entanglement cannot be useful to a single prover, and hence we may also assume 
that this last prover behaves classically. Since all provers are now classical, we have reduced our 
analysis to the classical setting and can appeal to the results in [BFL91] to conclude. We refer to 
Section 3 for a more detailed presentation and soundness analysis of the protocol. 

1.3 The multilinearity game 

The key step in the proof of Theorem 1 is the analysis of the multilinearity test of [BFL91], which 
generalizes the celebrated linearity test of Blum, Luby, and Rubinfeld [BLR93] and is essential 
in constructing a protocol for NEXP that has messages of polynomial length. 8 The test can be 
formulated as a game played between the verifier and three players. The game is parametrized 
by a finite field F and an integer n. In the game, the verifier performs either of the following with
probability 1/2 each: 

• Consistency test. The verifier chooses x ∈ F^n uniformly at random and sends the same question
x to all three players. He expects each of them to answer with an element of F, and
accepts if and only if all the answers are equal. 

• Linearity test. The verifier chooses i ∈ {1, . . . , n}, x ∈ F^n and y_i, z_i ∈ F uniformly at random,
and sets y_j = z_j = x_j for every j ∈ {1, . . . , n} \ {i}. He sends x, y, z to the three players,
receives a, b, c ∈ F, and accepts if and only if

\[ \frac{b - a}{y_i - x_i} = \frac{c - b}{z_i - y_i} = \frac{c - a}{z_i - x_i}. \]
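The two tests above can be simulated classically. The sketch below plays the game with three deterministic players over a small prime field; the prime P, the number of variables N, and the rule of resampling so that x_i, y_i, z_i are pairwise distinct (to avoid dividing by zero) are assumptions made here, not details taken from the text.

```python
import random

P = 101  # hypothetical prime; F is taken to be the integers mod P
N = 3    # number of variables

def make_multilinear(coeffs):
    """g(x) = sum over subsets S of coeffs[S] * prod_{i in S} x_i (S as a bitmask)."""
    def g(x):
        total = 0
        for mask, c in enumerate(coeffs):
            term = c
            for i in range(N):
                if (mask >> i) & 1:
                    term = term * x[i] % P
            total = (total + term) % P
        return total
    return g

def slope(u, v, s, t):
    """(v - u) / (t - s) in F, inverting via Fermat's little theorem."""
    return (v - u) * pow((t - s) % P, P - 2, P) % P

def consistency_test(players, rng):
    x = [rng.randrange(P) for _ in range(N)]
    return players[0](x) == players[1](x) == players[2](x)

def linearity_test(players, rng):
    i = rng.randrange(N)
    x = [rng.randrange(P) for _ in range(N)]
    # Assumption: y_i, z_i are drawn distinct from x_i and from each other.
    yi, zi = rng.sample([v for v in range(P) if v != x[i]], 2)
    y = x[:i] + [yi] + x[i + 1:]
    z = x[:i] + [zi] + x[i + 1:]
    a, b, c = players[0](x), players[1](y), players[2](z)
    return slope(a, b, x[i], yi) == slope(b, c, yi, zi) == slope(a, c, x[i], zi)

def play(players, rng):
    """One run of the game: each test is chosen with probability 1/2."""
    return rng.choice([consistency_test, linearity_test])(players, rng)

rng = random.Random(0)
g = make_multilinear([rng.randrange(P) for _ in range(2 ** N)])
print(all(play([g, g, g], rng) for _ in range(1000)))  # honest players always pass
```

Players who all answer with the same multilinear function pass every run, since such a function is affine along any axis-parallel line; a player answering with, say, x_1 squared is caught whenever the linearity test varies the first coordinate.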

Babai, Fortnow, and Lund show that, if any three deterministic players are accepted by the 
verifier with probability at least 1 - ε in this game, then the functions they each apply to their
questions in order to determine their respective answers are close to a single multilinear function
g : F^n → F (see Theorem 4.16 in [BFL91] for an analysis of a variant of the test over the integers).
That is, for all but at most a fraction roughly O(n^2 ε) (provided |F| is large enough) of x ∈ F^n, the
players' answer to question x is precisely g(x).

8 One can devise a protocol based on the linearity test alone, but it requires the verifier to send messages with 
exponential length to the provers. Such use of the linearity test was already key in establishing the early result NP ⊆
PCP(poly, 1); see e.g. Theorem 2.1.10 in [ALMSS98].
A major hurdle in proving a similar statement in case the players are allowed to use quantum
mechanics already arises in formulating the statement to be proven: even in the case of players 
restricting their use of entanglement as shared randomness, what meaning should one ascribe to 
their strategies being "close to multilinear"? Indeed, it could be that the answer of each player to a 
fixed question, when taken in isolation, is uniformly random: the whole substance of the strategy 
is in the correlations between the answers of different players. This difficulty is usually set aside 
by "fixing the randomness". Entanglement, however, cannot be "fixed", and this forces us to face 
even the presumably simpler case of randomized strategies head on. We show that the following 
is an appropriate formulation of Babai et al.'s multilinearity test in the general setting of entangled 
(or even just randomized) players (see Theorem 11 for a precise statement). 

Theorem 3 (Informal). Suppose that three entangled players who share a permutation-invariant state |Ψ⟩
succeed in the multilinearity game with probability 1 - ε where each player uses the set of measurements
{A_x^a}_{a∈F} to determine his answer to the verifier's question x ∈ F^n.

Then there exists a single measurement {V^g}, independent of any question and with outcomes in the
set of all multilinear functions g : F^n → F, such that, in the multilinearity game, each player's action is
indistinguishable from that of a player who, upon receiving his question x, would

1. Measure his share of |Ψ⟩ with {V^g}, obtaining a multilinear function g as an outcome,

2. Answer his question x with g(x). 

Moreover, the multilinear functions used by the three players are identical with high probability. 

In case the players are classical, but may use shared randomness, the theorem makes the fol- 
lowing simple statement: players successful in the multilinearity game are "indistinguishable" 
from players who would first look up their random string, based on that alone select a multilinear 
function g, and finally answer their respective questions x_i with g(x_i). While such a statement
is a direct corollary of Babai, Fortnow, and Lund's analysis, our contribution is to prove it with- 
out first "fixing the randomness" — and to show that it also holds for the case of players using 
entanglement. 

An appropriate notion of distance on entangled-prover strategies. Crucial to the applicability 
of Theorem 3 is the precise notion of "indistinguishability" used. Indeed, while there is no hope 
of making statements on the players' measurements or their shared entangled state themselves 
(since the verifier has no direct access to them throughout the protocol), one still needs to use a 
notion that is strong enough to be meaningful even when the multilinearity game is executed as a 
building block in the larger protocol explained in the previous section. 

The measure we use is based on the notion of consistency between two measurements, and 
it may be useful to introduce it here in a simplified setting (precise definitions are given in Section 2.1).
Let {A^i}_{i∈I} and {B^i}_{i∈I} be two quantum measurements of the same dimension and
indexed by the same set of outcomes: A^i, B^i ≥ 0 for all i ∈ I, and Σ_i A^i = Σ_i B^i = Id. Let |Ψ⟩ be
a bipartite state that is invariant under permutation of its two subsystems, and ρ its reduced state
on either. We say that A and B are ε-consistent if the following holds:

\[ \mathrm{CON}(A,B) := \sum_i \langle\Psi|\, A^i \otimes B^i \,|\Psi\rangle \ge 1 - \varepsilon. \tag{1} \]

This definition has an operational interpretation: the two measurements A and B, when performed
on the two subsystems of |Ψ⟩, give the same outcome except with probability ε. The
key fact about consistent measurements is the following. Suppose that A and A, B and B, and A
and B are all ε-consistent. Then A and B are indistinguishable in the sense that

[displayed equation (2) is missing from the extracted text]

This last expression corresponds to a more familiar notion of closeness of two measurements: they 
are close if the post-measurement states resulting from applying either are close in trace distance. 
The fact that (1) essentially implies (2) relies on Winter's "gentle measurement" lemma [Win99, 
Lemma 9] (see also Aaronson's "almost as good as new" lemma [Aar05, Lemma 2.2]), a key tool 
in our analysis. 
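As a sanity check on the consistency measure in (1), here is a small worked example that is not taken from the paper. The two-qubit maximally entangled state and the two measurements used below are illustrative assumptions; the identity in the first line holds for this particular state.

```latex
\[
  |\Psi\rangle = \tfrac{1}{\sqrt{2}}\bigl(|00\rangle + |11\rangle\bigr),
  \qquad
  \langle\Psi|\, X \otimes Y \,|\Psi\rangle = \tfrac12 \operatorname{Tr}\bigl(X\,Y^{T}\bigr).
\]
% A measures in the computational basis: A^i = |i><i| for i in {0,1}.
\[
  \mathrm{CON}(A,A)
  = \tfrac12 \sum_{i\in\{0,1\}} \operatorname{Tr}\bigl(|i\rangle\langle i|\,|i\rangle\langle i|\bigr)
  = 1 .
\]
% B measures in the Hadamard basis: B^0 = |+><+|, B^1 = |-><-|.
\[
  \mathrm{CON}(A,B)
  = \tfrac12 \Bigl( |\langle 0|+\rangle|^2 + |\langle 1|-\rangle|^2 \Bigr)
  = \tfrac12 \Bigl( \tfrac12 + \tfrac12 \Bigr)
  = \tfrac12 .
\]
```

So A is 0-consistent with itself on this state, while A and B are only 1/2-consistent: the two players' outcomes agree no more often than independent coin flips would.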

In this paper we will consider two measurements to be close whenever they are consistent, 
having the assurance that this notion of closeness implies the more traditional one expressed 
by (2). In particular, it is not hard to verify that (2) implies that either measurement may be 
"replaced" by the other even in a wider context; see the proof of Claim 12 in Section 3 for more 
details on how this can be done. The advantage of using this measure is that constraints on the 
consistency of measurements arise naturally from the analysis of the multilinearity game, and it 
is a notion that is very convenient to work with. 

Analysis of the multilinearity game: rounding entangled strategies. Theorem 3 states that success
in the multilinearity game forces even entangled players to make a trivial use of their entanglement:
since the measurement {V^g} is independent of their respective questions, they might as
well perform it before the game starts, in which case they are not using their entanglement at all. 
Hence the theorem implies that entangled players are no more powerful than classical players in 
that game. A key insight of our work, however, is to avoid any attempt to prove such a statement 
directly. Instead, our proof technique consists in progressively manipulating the players' strategies 
themselves, without explicitly trying to relate them to a classical strategy. 

Our goal is to show how the measurement {V^g} can be extracted from the initial set of measurements
{A_x^a}, which depend on x ∈ F^n. 9 More precisely, we show how, starting from the
original measurements, one may remove the dependence of {A_x^a} on x ∈ F^n one coordinate
at a time, eventually reaching the measurement {V^g}. Towards this we construct a sequence of
measurements {B^g_{x_{k+1},...,x_n}}, for k = 1, . . . , n, with outcomes g in the set of multilinear functions
F^k → F. Each of these measurements has the following key property: the respective strategies
corresponding to (i) measuring according to {A_x^a} and answering a or (ii) measuring according
to {B^g_{x_{k+1},...,x_n}} and answering g(x_1, . . . , x_k) are consistent, in the sense described in Eq. (1): two
distinct players using either strategy will obtain the same answer with high probability (provided 
they started with the same question). 

This sequence of measurements is defined by induction, and we only explain the one-dimensional 
case here. Our construction is intuitive: {B^g} corresponds to measuring using {A_x^a} twice, in succession,
using two randomly chosen values of x_1, and returning the unique linear function g which
interpolates between the two outcomes obtained. This can be interpreted as a quantum analogue
of the reconstruction procedure already used in the linearity test of Blum, Luby and Rubinfeld: to
recover a linear function it suffices to evaluate it at two random points, and then interpolate. The
construction of the measurements {B^g_{x_{k+1},...,x_n}} for the one-dimensional case is given in Claim 15,
and in the general case in Lemma 18, which states a quantum analogue of Babai et al.'s "pasting 
lemma" [BFL91, Lemma 5.11]. 
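The classical reconstruction step mentioned above (evaluate at two points, then interpolate) is short enough to write out. The sketch below is only the classical analogue over a prime field, not the quantum construction of the measurements; the modulus P and the function name are assumptions introduced for illustration.

```python
P = 101  # hypothetical prime; F is taken to be the integers mod P

def interpolate_line(x1, a1, x2, a2):
    """Return (c0, c1) such that c0 + c1 * x equals a1 at x1 and a2 at x2 over F."""
    if (x1 - x2) % P == 0:
        raise ValueError("need two distinct evaluation points")
    c1 = (a2 - a1) * pow((x2 - x1) % P, P - 2, P) % P  # slope, via Fermat inversion
    c0 = (a1 - c1 * x1) % P                            # intercept
    return c0, c1

# Recover g(x) = 7 + 5x from its values at x = 3 and x = 10.
print(interpolate_line(3, 22, 10, 57))  # (7, 5)
```

Two distinct points determine the linear function uniquely, which is why two evaluations suffice in the one-dimensional case.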

9 While we do give an explicit, inductive algorithmic procedure showing how {V^g} can be constructed, this is not
necessary: the point is only in proving its existence. 




An additional hurdle arises as a result of the induction: the quality of the approximation between
the original measurements {A_x^a} and the constructed measurements {B^g_{x_{k+1},...,x_n}} blows up
exponentially with k. In order to control this error, one has to perform an additional step of self- 
improvement. This step was a key innovation in the work of Babai, Fortnow, and Lund, and ex- 
tending it to the setting of entangled strategies requires substantially more work. While for the 
case of deterministic strategies Babai et al. were able to show, using the expansion properties of 
the hypercube, that any "reasonably good" k-linear approximation g at any point in the induction
was automatically "extremely good", in our case we need to actively update the measurements 
through a self-correction procedure, obtaining the "improved" measurements as the optimum of 
a certain convex optimization problem. The need for such active correction is not a limitation of 
our approach, but rather reflects a fundamental difference between the quantum and the classical, 
deterministic settings: while two binary-valued functions either fully agree or fully disagree at 
any point, two quantum measurements can produce outcomes according to distinct but arbitrar- 
ily close distributions (think of one of the measurements as being obtained from the other by a 
small perturbation, such as an arbitrarily small rotation). It is this kind of "error" that needs to be 
corrected, and we explain our method to do so in more detail in Section 5.1. 

1.4 Discussion and open questions 

Improving the parameters in Theorem 1 and Corollary 2 is an open problem. For example, it
might be possible to reduce the number of provers to two, and the number of rounds of
interaction to one, while still preserving exponentially small soundness error, resulting in the
inclusion NEXP $\subseteq$ MIP*$(2, 1, 1, 2^{-q})$ for every polynomial $q$. This would be an analogue of the
known containment NEXP $\subseteq$ MIP$(2, 1, 1, 2^{-q})$ [FL92]. Our overall protocol for NEXP requires
three provers, and four provers if we would like to parallelize it by using [Ito11]. We leave the
problem of reducing the number of provers for future work. It may also be possible to improve
the soundness guarantees in Corollary 2 by using the parallel repetition techniques from [KV11],
but we have not explored this possibility.

In comparison to the PCP theorem, there are important parameters which are not explicit in 
Theorem 1 and Corollary 2: the amount of randomness used by the verifier and the total answer 
length. In our constructions, both of them are just bounded by a polynomial in the input length 
for NEXP, and they are poly-logarithmic for the scaled-down version corresponding to verification 
of languages in NP. If these quantities could be reduced to logarithmic and constant, respectively,
for NP with constant soundness, the result would be an analogue of the PCP theorem in the presence
of entanglement. Obtaining such a result may require extending our analysis of the multilinearity
test to the more powerful low-degree tests that were key to establishing the "scaled-down" version 
of the PCP theorem. 

Honest provers in our protocol do not need entanglement in order to achieve completeness 1 
in the case of a YES-instance. It remains open whether entanglement can have any positive use in 
this context: is MIP* strictly larger than MIP = NEXP? 

Organization of the paper. After giving some necessary preliminaries, Section 3 describes the 
protocol used to prove Theorem 1, and shows how the theorem follows from a claim about the 
multilinearity game in the presence of entangled provers. Section 4 introduces a more technical 
claim about the analysis of the multilinearity game, which is amenable to a proof by induction on
the number n of variables in the verifier's questions in the game. The actual analysis is given in 
Section 5. 

2 Preliminaries

In the remainder of the paper we assume that the reader is familiar with computational complexity 
theory [Gol08, AB09], as well as with basic notions in quantum information [NC01, KSV02] such
as density matrices, POVM measurements, quantum channels, and the trace distance. For more 
on quantum computational complexity we refer the reader to a recent survey by Watrous [Wat09]. 

2.1 Notation 

For a field $\mathbb{F}$, a linear function $g: \mathbb{F} \to \mathbb{F}$ is a function for which there exist $a, b \in \mathbb{F}$ such that $g(x) = ax + b$.
A multilinear function $g: \mathbb{F}^k \to \mathbb{F}$ is a function that is linear in each of its coordinates. $\mathrm{ML}(\mathbb{F}^k, \mathbb{F})$
will denote the set of all multilinear functions from $\mathbb{F}^k$ to $\mathbb{F}$. We will denote tuples using bold
symbols such as $\mathbf{x}$ and $\mathbf{b}$. Given a tuple $\mathbf{x} = (x_1, \ldots, x_n)$ and $k \in [n]$, we let $\mathbf{x}_{\le k} := (x_1, \ldots, x_k)$,
$\mathbf{x}_{>k} := (x_{k+1}, \ldots, x_n)$ and $\mathbf{x}_{\neg k} := (x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n)$.

Given a positive matrix $\rho$ and an arbitrary matrix $A$, we let $\mathrm{Tr}_\rho(A) := \mathrm{Tr}(A\rho)$. In case $\rho$ is a
matrix on the tensor product of two Hilbert spaces $\mathcal{H}$ and $\mathcal{H}'$, and $A$ is a matrix acting on $\mathcal{H}$, we
will sometimes abuse notation and write $\mathrm{Tr}_\rho(A)$ for $\mathrm{Tr}_\rho(A \otimes \mathrm{Id}_{\mathcal{H}'})$. If $|\Psi\rangle \in \mathcal{H}^{\otimes k} \otimes \mathcal{H}'$ is a state
that is invariant under permutation of the first $k$ registers, we will often abuse notation further
and use the symbol $\rho$ to denote the reduced density of $|\Psi\rangle$ on either of the first $k$ registers, or even
any pair of registers among the first $k$, etc. Hence any expression of the form $\mathrm{Tr}_\rho(A \otimes B)$ should
really be read as

$$\langle\Psi|\, A \otimes B \otimes \mathrm{Id}_{\mathcal{H}} \otimes \cdots \otimes \mathrm{Id}_{\mathcal{H}} \otimes \mathrm{Id}_{\mathcal{H}'} \,|\Psi\rangle,$$

where the position of $A$ and $B$ among the first $k$ registers is immaterial by permutation-invariance.
For any $\rho \ge 0$, we let

$$\|A\|_\rho^2 := \mathrm{Tr}(AA^\dagger\rho),$$

and observe that $A \mapsto \|A\|_\rho$ is a semi-norm (it is definite if $\rho$ is invertible). It satisfies the following
Cauchy-Schwarz inequality: for any $A, B$,

$$\mathrm{Tr}_\rho(AB^\dagger) \le \|A\|_\rho \, \|B\|_\rho.$$
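
One way to verify both claims, sketched under the assumption that $\rho^{1/2}$ denotes the positive square root of $\rho$ and $\|\cdot\|_F$ the Frobenius norm (neither symbol is used elsewhere in the paper), is to rewrite the quantities through the ordinary Hilbert-Schmidt inner product and use cyclicity of the trace:

```latex
\begin{align*}
\|A\|_\rho^2 &= \mathrm{Tr}(AA^\dagger\rho)
  = \mathrm{Tr}\big((\rho^{1/2}A)(\rho^{1/2}A)^\dagger\big)
  = \|\rho^{1/2}A\|_F^2, \\
\big|\mathrm{Tr}_\rho(AB^\dagger)\big|
  &= \big|\mathrm{Tr}\big((\rho^{1/2}A)(\rho^{1/2}B)^\dagger\big)\big|
  \le \|\rho^{1/2}A\|_F \, \|\rho^{1/2}B\|_F
  = \|A\|_\rho \, \|B\|_\rho .
\end{align*}
```

The semi-norm properties of $A \mapsto \|A\|_\rho$ then follow from those of the Frobenius norm applied to $\rho^{1/2}A$, and definiteness holds when $\rho^{1/2}$ is invertible.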

Measurements. In this paper, a measurement is a collection of non-negative matrices $\{P^a\}_{a \in A}$
such that $\sum_a P^a = \mathrm{Id}$ (this is usually called a Positive Operator-Valued Measure, or POVM). The set
$A$ is the set of outcomes of the measurement; outcomes will always appear as superscripts. The
measurement is said to be projective if $P^a$ is a projector, i.e. $(P^a)^2 = P^a$, for every $a$. A sub-measurement
is a collection of non-negative matrices $\{P^a\}_{a \in A}$ such that $\sum_a P^a \le \mathrm{Id}$. For integers $0 \le k \le n$ we
will also consider families of sub-measurements, indexed by $\mathbf{x} \in \mathbb{F}^{n-k}$ and with outcomes in the
set $\mathrm{ML}(\mathbb{F}^k, \mathbb{F})$. Such a family $P = \{P^g_{\mathbf{x}_{>k}}\}$ will be called a family of sub-measurements of arity $k$ (the
parameter $n$ will often be left implicit). A family of sub-measurements of arity $n$ is thus a single
sub-measurement with outcomes in $\mathrm{ML}(\mathbb{F}^n, \mathbb{F})$. Given a family of sub-measurements $P = \{P^g_{\mathbf{x}_{>k}}\}$
of arity $k$, we will often use the notation

$$P_{\mathbf{x}_{>\ell}} := \mathrm{E}_{x_{k+1}, \ldots, x_\ell} \sum_g P^g_{\mathbf{x}_{>k}}$$

for any $k \le \ell \le n$, where the expectation is taken with respect to the uniform distribution on $\mathbb{F}^{\ell-k}$
(for $\ell = n$ we simply write $P$).
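Given two families of sub-measurements $P$ and $Q$ with arities $k \le \ell$ respectively, we define their
consistency

$$\mathrm{CON}(P, Q) := \mathrm{E}_{\mathbf{x} \in \mathbb{F}^n} \sum_{g} \mathrm{Tr}_\rho\big(P^{\,g|_{x_{k+1}, \ldots, x_\ell}}_{\mathbf{x}_{>k}} \otimes Q^{g}_{\mathbf{x}_{>\ell}}\big),$$

where $g|_{x_{k+1}, \ldots, x_\ell}$ is the $k$-linear function obtained by fixing the $(\ell - k)$ variables of $g$ with indices
$k+1, \ldots, \ell$ to the values $x_{k+1}, \ldots, x_\ell$, and their inconsistency

$$\mathrm{INC}(P, Q) := \mathrm{E}_{\mathbf{x} \in \mathbb{F}^n} \sum_{f, g:\; g|_{x_{k+1}, \ldots, x_\ell} \ne f} \mathrm{Tr}_\rho\big(P^{f}_{\mathbf{x}_{>k}} \otimes Q^{g}_{\mathbf{x}_{>\ell}}\big),$$

where $\rho$ is a density matrix which will always be clear from the context. If $k > \ell$ then we define
$\mathrm{CON}(P, Q) := \mathrm{CON}(Q, P)$ and $\mathrm{INC}(P, Q) := \mathrm{INC}(Q, P)$. We will also use the shorthands $\mathrm{CON}(P) =
\mathrm{CON}(P, P)$ and $\mathrm{INC}(P) = \mathrm{INC}(P, P)$. Note that if $P$ is a complete family of measurements, i.e.
$\sum_f P^f_{\mathbf{x}_{>k}} = \mathrm{Id}$ for every $\mathbf{x}_{>k}$, then

$$\mathrm{CON}(P, Q) + \mathrm{INC}(P, Q) = \mathrm{E}_{\mathbf{x}} \sum_g \mathrm{Tr}_\rho\big(Q^g_{\mathbf{x}_{>\ell}}\big) = \mathrm{Tr}_\rho(Q),$$

which equals 1 if $Q$ is also complete.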
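
The identity can be checked in one line. The following is a sketch that assumes only the definitions of consistency and inconsistency given above, for arities $k \le \ell$: every pair $(f, g)$ of outcomes contributes to exactly one of the two quantities, according to whether $f$ equals the restriction of $g$ or not, so the two sums combine into an unrestricted double sum and completeness of $P$ removes the sum over $f$.

```latex
\begin{align*}
\mathrm{CON}(P,Q) + \mathrm{INC}(P,Q)
 &= \mathrm{E}_{\mathbf{x}} \sum_{g} \sum_{f}
    \mathrm{Tr}_\rho\big(P^{f}_{\mathbf{x}_{>k}} \otimes Q^{g}_{\mathbf{x}_{>\ell}}\big) \\
 &= \mathrm{E}_{\mathbf{x}} \sum_{g}
    \mathrm{Tr}_\rho\Big(\Big(\sum_f P^{f}_{\mathbf{x}_{>k}}\Big) \otimes Q^{g}_{\mathbf{x}_{>\ell}}\Big)
  = \mathrm{E}_{\mathbf{x}} \sum_{g}
    \mathrm{Tr}_\rho\big(\mathrm{Id} \otimes Q^{g}_{\mathbf{x}_{>\ell}}\big).
\end{align*}
```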

2.2 Multi-prover interactive proofs

In this section we define the complexity classes that this work is concerned with: multi-prover 
interactive proof systems (MIP systems) and multi-prover interactive proof systems with entangle- 
ment (MIP* systems). 

Let k(n) be an integer, denoting the number of provers, and m(n) an integer denoting the num- 
ber of rounds. Both k(n) and m(n) are from the set of polynomially bounded, polynomial-time 
computable functions in the input size \x\, denoted by poly. Further, c and s denote polynomial- 
time computable functions of the input size into [0, 1] corresponding to completeness acceptance 
probability and soundness error. For notational convenience in what follows we will omit the 
arguments of these functions. 

Multi-prover interactive proof systems (MIP systems): Let $k, m, l \in$ poly. A $k$-prover interactive
proof system consists of a verifier $V$ and $k$ provers $P_1, \ldots, P_k$. The verifier is a probabilistic
polynomial-time Turing machine, and the provers are computationally unbounded. Each of them
has a read-only input tape and a private work tape. Each prover has a communication tape. The
verifier has a random tape. The verifier also has $k$ communication tapes, one for each prover, each
of which is $l$ bits long.

The input tape for every party contains the same input string $x$. The protocol consists of $m(|x|)$
rounds. In each round, first the verifier runs for a polynomial amount of time, updating the work
and communication tapes. After that, the content of the $i$-th communication tape is sent to the $i$-th
prover for each $i = 1, \ldots, k(|x|)$. Each prover reads this string, updates the content of his own
work tape, and decides on a reply to the verifier. The reply from the $i$-th prover is written on the $i$-th
communication tape, and this completes the round. After $m(|x|)$ rounds of interaction, the verifier
produces a special output bit, designating acceptance or rejection. The operations of the provers are
instantaneous and do not even have to be computable; the provers are assumed to be able to
"compute" any function.

For simplicity, we assume that each message between the verifier and the provers in each
round is exactly $l$ bits long for the purpose of a formal definition, but it is not hard to modify the
definition to incorporate the more general case which does not satisfy this assumption. Formally,
a strategy for $P_1, \ldots, P_k$ in a $k$-prover $m$-round interactive proof system consists of the length $l' \in \mathbb{N}$
of a work tape, and $km$ mappings $f_{ij}: \{0,1\}^{l} \times \{0,1\}^{l'} \to \{0,1\}^{l} \times \{0,1\}^{l'}$ for $1 \le i \le k$
and $1 \le j \le m$. Each mapping $f_{ij}$ specifies the operation which prover $i$ performs in round $j$:
$f_{ij}(q, w) = (q', w')$ means that if the message from the verifier in this round is $q$ and the work tape
contains string $w$ before the operation by the prover, then the message to the verifier in this round
is $q'$ and the work tape contains string $w'$ after the operation.
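
The formal definition can be read as a small state machine. The following is a minimal sketch, not part of the paper's formalism: the names `run_interaction`, `ask`, `decide`, `remember` and `recall` are hypothetical, bit strings are represented as Python `str`, and the verifier's randomness is replaced by fixed questions so that the example is deterministic.

```python
from typing import Callable, List, Sequence, Tuple

# f_ij: (message q, work tape w) -> (reply q', new work tape w')
Mapping = Callable[[str, str], Tuple[str, str]]
Transcript = List[Tuple[List[str], List[str]]]  # (questions, replies) per round

def run_interaction(
    strategy: Sequence[Sequence[Mapping]],       # strategy[i][j] plays the role of f_ij
    ask: Callable[[int, Transcript], List[str]], # verifier: questions for round j
    decide: Callable[[Transcript], bool],        # verifier: final accept/reject bit
    work_len: int,                               # l', length of each work tape
) -> bool:
    k, m = len(strategy), len(strategy[0])
    tapes = ["0" * work_len for _ in range(k)]   # private work tapes, one per prover
    transcript: Transcript = []
    for j in range(m):
        questions = ask(j, transcript)           # verifier writes k messages
        replies = []
        for i in range(k):
            # Prover i sees only his own message and his own work tape.
            reply, tapes[i] = strategy[i][j](questions[i], tapes[i])
            replies.append(reply)
        transcript.append((questions, replies))
    return decide(transcript)

def remember(q: str, w: str) -> Tuple[str, str]:
    return "0", q        # round 1: store the question on the work tape

def recall(q: str, w: str) -> Tuple[str, str]:
    return w, w          # round 2: reply with the stored bit

strategy = [[remember, recall], [remember, recall]]
ask = lambda j, t: ["1", "0"] if j == 0 else ["0", "0"]
decide = lambda t: t[1][1] == t[0][0]   # round-2 replies must equal round-1 questions
accepted = run_interaction(strategy, ask, decide, work_len=1)
```

In this toy run with $k = 2$, $m = 2$ and $l = l' = 1$, the work tape is the only way a prover can carry information from one round to the next, which is the role it plays in the definition.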

Definition 4. Let $k, m: \mathbb{N} \to \mathbb{N}$, and let $c, s: \mathbb{N} \to [0,1]$ such that $c(n) > s(n)$ for all $n \in \mathbb{N}$. A
language $L$ is in MIP$(k, m, c, s)$ if and only if there exists an $m$-round polynomial-time verifier $V$ for a
$k$-prover interactive proof system such that, for every input $x$:

(Completeness) if $x \in L$, there exists a strategy for provers $P_1, \ldots, P_k$ such that the interaction protocol
of $V$ with $(P_1, \ldots, P_k)$ results in the verifier accepting with probability at least $c$,

(Soundness) if $x \notin L$, for any strategy for provers $P'_1, \ldots, P'_k$, the probability that the interaction protocol
of $V$ with $(P'_1, \ldots, P'_k)$ results in the verifier accepting is at most $s$.

In this formulation, the provers are deterministic, but this is not a limitation because it is well- 
known that the power of the model does not change if we allow the provers to share a random 
source. 

If some of the parameters k, m, c, and s are sets of functions instead of single functions, the 
class is interpreted to be the union over all choices in the sets. For example, 

$$\mathrm{MIP}(4, 1, 1, 1 - 1/\mathrm{poly}) = \bigcup_{f \in \mathrm{poly}} \mathrm{MIP}(4, 1, 1, 1 - 1/f).$$

We denote MIP(poly, poly, 2/3, 1/3) simply by MIP. 

Multi-prover interactive proof systems with entanglement (MIP* systems): First introduced 
in [CHTW04], MIP* systems are defined analogously to MIP systems. The only difference is 
that now the provers are allowed to be quantum, while the verifier (and communication) remains 
bounded in classical probabilistic polynomial-time. This implies that the provers may share an
arbitrary entangled state $|\Psi\rangle$ among themselves before the protocol starts and that each prover may
use his part of the entangled state to determine his reply to the verifier. In each round, the provers
individually receive the messages from the verifier in a message register, perform a quantum oper- 
ation on this register together with their share of the entangled state, measure the message register 
in the computational basis, and send back the outcome to the verifier. 

Formally, an entangled strategy for $P_1, \ldots, P_k$ in a $k$-prover $m$-round interactive proof system
with entanglement consists of the length $l' \in \mathbb{N}$ of a work tape, $km$ quantum channels $\Phi_{ij}$ from a
quantum register of $l + l'$ qubits to itself for $1 \le i \le k$ and $1 \le j \le m$, and the initial quantum
state $|\Psi\rangle$ of the work tape, which is a $kl'$-qubit state. Each channel $\Phi_{ij}$ specifies the operation
which prover $i$ performs in round $j$: the first $l$ qubits in the state correspond to the message from
and to the verifier, and the last $l'$ qubits represent the content of the work tape. After the prover's
operation, the first $l$ qubits are measured in the computational basis and sent to the verifier.

Definition 5. A language $L$ is in MIP*$(k, m, c, s)$ if and only if there exists an $m$-round polynomial-time
verifier $V$ for $k$-prover interactive proof systems such that, for every input $x$:

(Completeness) if $x \in L$, there exists an entangled strategy for provers $P_1, \ldots, P_k$ such that the interaction
protocol of $V$ with $(P_1, \ldots, P_k)$ results in the verifier accepting with probability at least $c$,

(Soundness) if $x \notin L$, for any entangled strategy for provers $P'_1, \ldots, P'_k$, the probability that the interaction
protocol of $V$ with $(P'_1, \ldots, P'_k)$ results in the verifier accepting is at most $s$.

In certain cases, we can simplify part of the definition of entangled strategies. Suppose that
the verifier interacts with a certain prover $P_i$ only once; i.e., the verifier is guaranteed to send $P_i$ the
empty string (or a fixed string) in rounds other than round $j$, and is guaranteed to ignore the reply
from $P_i$ in rounds other than round $j$. In this case, instead of specifying $m$ quantum channels to
describe the behavior of $P_i$ in the $m$ rounds, we may just specify measurements $A_q = (A_q^r)_r$ for each
message $q$ from the verifier, where the outcome of each measurement gives a reply to the verifier. 10
Since all the interactive proof systems considered in this paper have the property that the verifier
interacts with each prover only once except for one prover, we use this simplified formulation in
many places.

Note that we do not assume any upper bound on the size $l'$ of the work tape used by each
prover (in particular, we do not assume that $l' \in$ poly; the model with this restriction is considered
in [KM03]). However, we do assume that they only use a finite-dimensional Hilbert space. A more
general definition is commuting-operator provers, considered by Tsirelson [Tsi80] in the context 
of Bell inequalities and later in [SW08, DLTW08, NPA08, IKPSY08]. Although we expect that our 
results remain valid with minor modifications to the proofs even if dishonest provers are allowed 
to use arbitrary commuting-operator strategies, we have not explored this possibility. 

Symmetry. We will make important use of symmetry in the protocols that we introduce. It
will be a useful simplifying assumption in two respects: first it lets one assume that the set of 
measurements used by all provers is the same. Second, and most important, it implies that the 
provers' shared entangled state is also permutation-invariant. 

Definition 6. Let $(P_1, \ldots, P_k, |\Psi\rangle)$ be a $k$-prover strategy. 11 We say that this strategy is symmetric, or
permutation-invariant, if $P_1 = \cdots = P_k$ and $|\Psi\rangle$ is invariant with respect to any permutation of the
subsystems corresponding to each prover.

The following simple lemma (which already appears in [KKMTV11, Lemma 4]) shows that 
one can always assume without loss of generality that if a game has a certain symmetry then there 
is an optimal strategy for the provers which reflects that symmetry. 

Lemma 7. Suppose an MIP* proof system is given such that the protocol treats provers $P_1, \ldots, P_k$
symmetrically (i.e. the protocol is invariant under permutation of their questions and corresponding
inverse-permutation of their answers). Then given any strategy $P_1, \ldots, P_k$ with entangled state $|\Psi\rangle$ that succeeds
with probability $p$, there exists a strategy $P'_1, \ldots, P'_k$ with entangled state $|\Psi'\rangle$ and success probability $p$
such that $P'_1 = \cdots = P'_k$ and $|\Psi'\rangle$ is permutation-invariant.

10 Any classical post-processing by the prover can be incorporated as part of the description of his measurement. 

11 We think of $P_i$ as an arbitrary representation of the set of all quantum channels applied by prover $i$ throughout the
protocol. 

Proof. By appropriately padding with extra qubits, assume that all $k$ registers of $|\Psi\rangle$ have the
same dimension. Define strategies $P'_1, \ldots, P'_k$ as follows: the provers share the entangled state
$|\Psi'\rangle = \sum_{\sigma \in S_k} |\sigma(1)\rangle \otimes \cdots \otimes |\sigma(k)\rangle \otimes |\Psi^\sigma\rangle$ (up to normalization), where the register containing
$|\sigma(i)\rangle$ is given to prover $i$ and $|\Psi^\sigma\rangle$ is obtained from $|\Psi\rangle$ by permuting its registers according to $\sigma$.
For $1 \le i \le k$, prover $i$ measures the register containing $|\sigma(i)\rangle$ and behaves as in the strategy $P_{\sigma(i)}$. By the assumed
symmetry of the protocol this new strategy has the same success probability $p$, and $|\Psi'\rangle$ has the
required symmetry properties. □
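
The construction in the proof can be checked on a toy example. The sketch below is an illustration under simplifying assumptions, not the paper's formalism: a pure state is stored as a dictionary from tuples of local basis labels to amplitudes, provers are numbered from 0, the function names `symmetrize` and `permute_provers` are hypothetical, and the normalization $1/\sqrt{k!}$ is made explicit.

```python
from itertools import permutations
from math import sqrt

def symmetrize(psi, k):
    """Toy version of the state |Psi'> from the proof of Lemma 7.

    psi maps k-tuples of local basis labels (one label per prover) to
    amplitudes. The result maps k-tuples of pairs (sigma(i), label) to
    amplitudes: prover i holds the classical index sigma(i) together with
    the register that belonged to prover sigma(i) in the original state.
    """
    perms = list(permutations(range(k)))
    norm = 1 / sqrt(len(perms))          # 1/sqrt(k!) normalization
    out = {}
    for sigma in perms:
        for x, amp in psi.items():
            key = tuple((sigma[i], x[sigma[i]]) for i in range(k))
            out[key] = out.get(key, 0) + norm * amp
    return out

def permute_provers(state, pi):
    """Relabel the provers: the subsystem of prover i moves to position pi[i]."""
    out = {}
    for key, amp in state.items():
        new = [None] * len(key)
        for i, part in enumerate(key):
            new[pi[i]] = part
        out[tuple(new)] = out.get(tuple(new), 0) + amp
    return out

# A product state |0>|1>|2> of three provers, which is far from symmetric.
psi = {(0, 1, 2): 1.0}
sym = symmetrize(psi, 3)
invariant = all(permute_provers(sym, pi) == sym for pi in permutations(range(3)))
```

Exchanging the provers only reorders the terms of the sum over $\sigma$, which is why `invariant` evaluates to true; the index register $|\sigma(i)\rangle$ tells prover $i$ which of the original strategies to follow.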

The following claim states a trivial but useful fact about symmetric one-round strategies. 

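Claim 8. Let $(P_1, \ldots, P_k, |\Psi\rangle)$ be a symmetric one-round strategy, and for every $i \in \{1, \ldots, k\}$, $\{A_i^a\}_a$ a
measurement for the $i$-th prover in that strategy. Then for every permutation $\sigma$ on $\{1, \ldots, k\}$, and every
$(a_1, \ldots, a_k)$,

$$\langle\Psi|\, A_1^{a_1} \otimes \cdots \otimes A_k^{a_k} \,|\Psi\rangle = \langle\Psi|\, A_{\sigma(1)}^{a_{\sigma(1)}} \otimes \cdots \otimes A_{\sigma(k)}^{a_{\sigma(k)}} \,|\Psi\rangle.$$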

2.3 NEXP-complete problems 

We will use the following NEXP-complete problem, whose NEXP-completeness was shown by 
Papadimitriou and Yannakakis [PY86]: 

Problem 1: Succinct 3-colorability. 

Instance. An integer $n \in \mathbb{N}$ in unary and a Boolean circuit $C$ for a function $\{0,1\}^n \times \{0,1\}^n \to
\{0, 1\}$ which represents the adjacency matrix of a graph on $2^n$ vertices.

Question. Is the graph represented by C 3-colorable? 

Using the standard technique of arithmetization (see e.g. Proposition 3.1 and Lemma 7.1 of 
Ref. [BFL91]), one can show that the following problem is also NEXP-complete. 

Problem 2: Succinct 3-colorability, arithmetized version. 

Instance. Integers $r, n \in \mathbb{N}$ in unary and an arithmetic expression 12 for a polynomial $f(\alpha, z, b_1, b_2,
a_1, a_2)$, where $z$ represents $r$ variables and each of $b_1, b_2$ represents $n$ variables.

Yes-promise. If $\mathbb{F}$ is a field with more than two elements and $\alpha \in \mathbb{F} \setminus \{0, 1\}$, then there exists a
mapping $A: \{0,1\}^n \to \{0, 1, \alpha\}$ such that for all $z \in \{0,1\}^r$ and all $b_1, b_2 \in \{0,1\}^n$, it holds that

$$f(\alpha, z, b_1, b_2, A(b_1), A(b_2)) = 0. \qquad (3)$$

No-promise. If $\mathbb{F}$ is a field with more than two elements and $\alpha \in \mathbb{F} \setminus \{0, 1\}$, then for every
mapping $A: \{0,1\}^n \to \mathbb{F}$, there exist $z \in \{0,1\}^r$ and $b_1, b_2 \in \{0,1\}^n$ such that Eq. (3) is not satisfied.

We note that the degree of the polynomial $f$ represented by the arithmetic expression can be at
most the size of the arithmetic expression, and is therefore bounded by the input size. 

2.4 Summation test 

Let $\mathbb{F}$ be a finite field of characteristic two. 13 If $|\mathbb{F}| = 2^k$, an encoding scheme of elements in $\mathbb{F}$ is
specified by $k$ and an irreducible polynomial $f(t)$ over $\mathbb{F}_2$ of degree $k$. In particular, if $|\mathbb{F}| = 2^{2 \cdot 3^\ell}$,

12 An arithmetic expression is a rooted tree whose internal nodes represent either addition or multiplication and whose 
leaves represent either variables or an integer constant. The size of an arithmetic expression is the number of nodes 
plus the sum of the number of bits required to represent the integer for each constant node. 

13 The restriction to fields of characteristic two arises from the use of Theorem 43 in Appendix C. 

then it is known that $f(t) = t^{2 \cdot 3^\ell} + t^{3^\ell} + 1$ is irreducible over $\mathbb{F}_2$, and this specifies an encoding
scheme for $\mathbb{F}$ (see Appendix G.3 of Goldreich [Gol08]). 14
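
For small $\ell$ the irreducibility claim can be confirmed by brute force, and the resulting encoding used for field arithmetic. The sketch below makes several assumptions that are not in the paper: polynomials over $\mathbb{F}_2$ are stored as Python integers (bit $i$ is the coefficient of $t^i$), the helper names are hypothetical, and irreducibility is tested by naive trial division, which is only feasible for small degrees.

```python
def pdeg(a: int) -> int:
    """Degree of a polynomial over GF(2) stored as a bitmask (-1 for zero)."""
    return a.bit_length() - 1

def pmod(a: int, m: int) -> int:
    """Remainder of a modulo m in GF(2)[t]; subtraction is XOR."""
    dm = pdeg(m)
    while a and pdeg(a) >= dm:
        a ^= m << (pdeg(a) - dm)
    return a

def is_irreducible(f: int) -> bool:
    """Trial division by every polynomial of degree 1..deg(f)//2."""
    d = pdeg(f)
    if d < 1:
        return False
    # Bitmasks 2 .. 2^(d//2 + 1) - 1 are exactly the polynomials of degree 1..d//2.
    return all(pmod(f, g) != 0 for g in range(2, 1 << (d // 2 + 1)))

def trinomial(l: int) -> int:
    """The polynomial t^(2*3^l) + t^(3^l) + 1 as a bitmask."""
    return (1 << (2 * 3 ** l)) | (1 << (3 ** l)) | 1

def gf_mul(a: int, b: int, f: int) -> int:
    """Multiply two field elements in GF(2)[t]/(f): carry-less product, then reduce."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    return pmod(prod, f)

small_cases_ok = all(is_irreducible(trinomial(l)) for l in range(3))
```

With $\ell = 0$ the field is $\mathbb{F}_4 = \mathbb{F}_2[t]/(t^2 + t + 1)$, in which `gf_mul(2, 2, trinomial(0))` computes $t \cdot t = t + 1$. A verifier working with large fields would rely on the known irreducibility result rather than on trial division.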

Consider the following promise problem, which has both an explicit and an implicit input. 

Problem 3: Summation Test Problem. 

Explicit input. Integers $m, d \in \mathbb{N}$ in unary, and an encoding scheme of a finite field $\mathbb{F}$ of characteristic
two.

Implicit input. A mapping $h: \mathbb{F}^m \to \mathbb{F}$.

Promise. The given encoding scheme is valid, and the mapping $h: \mathbb{F}^m \to \mathbb{F}$ is a polynomial function
of degree at most $d$ in each variable.

Question. Is

$$\sum_{x \in \{0,1\}^m} h(x) = 0 \quad (\text{in } \mathbb{F})? \qquad (4)$$

In a (single-prover) interactive proof system for a problem with an implicit input, the implicit 
input is given to the verifier as an oracle. 15 The following variant of the summation test of Lund, 
Fortnow, Karloff, and Nisan [LFKN92] is a special case of Lemma 3.5 in Ref. [BFL91]. 

Lemma 9 (Summation test [BFL91]). Suppose that $|\mathbb{F}| > 2dm$. Then there exists a single-prover
interactive proof system for the Summation Test Problem with perfect completeness and soundness error
at most $dm/|\mathbb{F}|$. Moreover, in this interactive proof system, the verifier behaves as follows. First he
chooses $q \in \mathbb{F}^m$ uniformly at random. Then he interacts with the prover. At the same time, he reads the
value $h(q)$ from the implicit input. Finally he accepts or rejects depending on $q$, $h(q)$, and the interaction
with the prover. 16
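
To make the shape of such a protocol concrete, here is a minimal sketch of the textbook sum-check interaction in the spirit of [LFKN92]. It is not the exact protocol of Lemma 3.5 in [BFL91]: it works over a small prime field (an assumption made to keep the arithmetic short, whereas this paper uses fields of characteristic two), the prover sends each univariate polynomial as its values at the points $0, \ldots, d$, and all function names are hypothetical. As in the lemma, the verifier's random point $q = (r_1, \ldots, r_m)$ is the only place where the implicit input $h$ is read.

```python
import random
from itertools import product

P = 101  # a small prime; assumed larger than the degree bound d

def lagrange_eval(ys, r, p=P):
    """Value at r of the unique polynomial of degree < len(ys) taking
    the values ys at the points 0, 1, ..., len(ys) - 1 (arithmetic mod p)."""
    total = 0
    for i, yi in enumerate(ys):
        num, den = 1, 1
        for j in range(len(ys)):
            if j != i:
                num = num * (r - j) % p
                den = den * (i - j) % p
        total = (total + yi * num * pow(den, p - 2, p)) % p
    return total

def honest_round(h, m, d, prefix, p=P):
    """Honest prover: values at X = 0..d of the partial sum
    g(X) = sum over Boolean suffixes of h(prefix, X, suffix)."""
    free = m - len(prefix) - 1
    return [sum(h(*prefix, X, *s) for s in product((0, 1), repeat=free)) % p
            for X in range(d + 1)]

def sumcheck(h, m, d, claim, prover=honest_round, rng=random, p=P):
    """Verifier for the claim sum_{x in {0,1}^m} h(x) = claim (mod p); needs d >= 1."""
    prefix = []
    for _ in range(m):
        ys = prover(h, m, d, tuple(prefix))
        if (ys[0] + ys[1]) % p != claim % p:   # g(0) + g(1) must match the claim
            return False
        r = rng.randrange(p)                    # fresh random field element
        claim = lagrange_eval(ys, r, p)         # new claim: g(r)
        prefix.append(r)
    return h(*prefix) % p == claim % p          # single query to the implicit input

h = lambda x, y, z: x * y + 2 * z + x * x * z   # degree at most 2 in each variable
true_sum = sum(h(*b) for b in product((0, 1), repeat=3)) % P
```

Here `sumcheck(h, 3, 2, true_sum)` accepts for every choice of randomness, while a false claim is caught by the first consistency check when the prover answers honestly; a cheating prover has to send a wrong polynomial, which survives a random evaluation only with small probability.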

To apply the summation test to Problem 2, we have to consider exponentially many constraints 
instead of one. 

Problem 4: AND Test Problem. 

Explicit input. Integers $k, d \in \mathbb{N}$ in unary, and an encoding scheme of a finite field $\mathbb{F}$ of characteristic
two.

Implicit input. A mapping $h: \mathbb{F}^k \to \mathbb{F}$.

Promise. The given encoding scheme is valid, and the mapping $h: \mathbb{F}^k \to \mathbb{F}$ is a polynomial function
of degree at most $d$ in each variable.

Question. Is $h(i) = 0$ (in $\mathbb{F}$) for all $i \in \{0,1\}^k$?

The idea for the following corollary is already explained in Section 7.1 of Ref. [BFL91]. We will 
give a proof in Appendix C for the sake of completeness. 

Corollary 10. There exists a polynomial $q: \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ for which the following holds. There exists a
single-prover interactive proof system for the AND Test Problem with perfect completeness and soundness
14 Alternatively, we can use a deterministic polynomial-time algorithm to find an irreducible polynomial of a specified 
degree over F2 by Shoup [Sho90]. 

15 In Ref. [BFL91], the authors refer to the interactive proof system for the Summation Test Problem as an "interactive 
oracle-protocol," viewing the mapping h as an exponentially long certificate string which is given to the verifier as an 
oracle. However, for our purposes it will be more convenient to treat h as part of the input. 

16 In particular, this implies that the verifier reads only one value $h(q)$ from the implicit input and the position $q \in \mathbb{F}^m$
to read is chosen uniformly in $\mathbb{F}^m$. Together with the soundness guarantee, this in turn implies that if the implicit input
is $\delta$-close to a polynomial function $h$ of degree at most $d$ in each variable and $h$ fails to satisfy the equation (4), then the
verifier accepts with probability at most $\delta + dm/|\mathbb{F}|$ no matter what the prover does.

error at most $5/8 + q(k, d)/|\mathbb{F}|$. Moreover, in this interactive proof system, the verifier behaves as follows.
First he chooses $i \in \mathbb{F}^k$ uniformly and independently at random. Then he interacts with the prover. At
the same time, he reads the value $h(i)$ from the implicit input. Finally he accepts or rejects depending on $i$,
$h(i)$, and the interaction with the prover.

3 A proof system for Succinct 3-Colorability 

In this section we prove Theorem 1, assuming the soundness of the multilinearity game (see
Theorem 11 below), which will be proved in Sections 4 and 5. We first describe a three-prover
poly-round proof system for the NEXP-complete Succinct 3-Colorability problem, in its arithmetized
form (Problem 2), in Section 3.1. In Section 3.2 we show that the protocol has perfect completeness with classical provers,
and in Section 3.3 we show that it has soundness error at most $1 - 1/\mathrm{poly}$ with entangled provers.
Theorem 1 is then obtained by repeating this protocol sequentially.

3.1 Description of the protocol 

We construct a three-prover poly-round proof system for Problem 2. Our protocol follows that
of [BFL91] for the Oracle-3-Satisfiability problem very closely. In the protocol of [BFL91], the
verifier makes three queries to an oracle that answers with a Boolean value. Because our problem
is Succinct 3-Colorability instead of Oracle-3-Satisfiability, the verifier would make two queries to
an oracle that answers with a ternary value. We replace these two queries to the oracle by queries to
two distinct provers.

Label the provers as $P_1, P_2, P_3$. The protocol will be symmetric under any permutation of the
three provers. Let $(r, n, f)$ be an instance of Problem 2, as described in Section 2.3. Let $d_f$ be
the maximum degree of $f$ in any one variable. Let $m = r + 2n$ and $d = 2d_f$. Let $0 < c_0 < 1$
be a constant defined later (in Theorem 11), and $p$ be the smallest number of the form $p = 2^{2 \cdot 3^\ell}$
such that $p \ge \max\{8q(m, d), n^{1/c_0 + 4}\}$, where $q$ is the polynomial appearing in the statement of
Corollary 10. Let $\mathbb{F}$ be the finite field of size $p$. As was noted in Section 2.4, an explicit encoding
scheme for $\mathbb{F}$ is known in this case. In the protocol, all arithmetic operations in $\mathbb{F}$ are performed
using this encoding scheme.
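
The choice of $p$ is a simple search over $\ell$. In the sketch below the function name and the parameter `bound` are hypothetical; `bound` stands for the quantity $\max\{8q(m, d), n^{1/c_0 + 4}\}$, whose polynomial $q$ and constant $c_0$ are not made explicit here, so the caller is assumed to supply it.

```python
def smallest_field_exponent(bound: int):
    """Return (l, k) with k = 2 * 3**l minimal such that 2**k >= bound."""
    l = 0
    while 2 ** (2 * 3 ** l) < bound:
        l += 1                      # the exponent k triples at every step
    return l, 2 * 3 ** l

l, k = smallest_field_exponent(1000)   # 2**6 = 64 < 1000 <= 2**18
```

Because the exponent triples from one candidate to the next, the selected $p = 2^k$ is smaller than the cube of `bound` whenever $\ell \ge 1$, so field elements still have polynomially long encodings.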

In the protocol, each prover $P_i$ is told explicitly to play one of the following two roles:

• Lookup prover: $P_i$ receives an element of $\mathbb{F}^n$, and responds with an element of $\mathbb{F}$. In this case,
the interaction between the verifier and $P_i$ takes only one round.

• AND-test prover: $P_i$ acts as the prover in the protocol for the AND test (Corollary 10). In this
case, the interaction between the verifier and $P_i$ takes polynomially many rounds.

The verifier performs one of the following five tests chosen uniformly at random:

• Consistency test. He tells each prover to act as a lookup prover. He chooses $x \in \mathbb{F}^n$ uniformly
at random and sends the same question $x$ to provers $P_1, P_2, P_3$. He expects each prover to
answer with an element of $\mathbb{F}$, and accepts if and only if all the answers are equal.

• Linearity test. He tells each prover to act as a lookup prover. He chooses $i \in \{1, \ldots, n\}$,
$x \in \mathbb{F}^n$ and $y_i \ne z_i \in \mathbb{F} \setminus \{x_i\}$ uniformly at random, and sets $y_j = z_j = x_j$ for every $j \in
\{1, \ldots, n\} \setminus \{i\}$. He sends $x$ to $P_1$, $y$ to $P_2$, and $z$ to $P_3$. He receives $a, b, c \in \mathbb{F}$ from these three
provers, and accepts if and only if

$$\frac{b - a}{y_i - x_i} = \frac{c - b}{z_i - y_i} = \frac{c - a}{z_i - x_i}.$$

• AND test with $P_3$ as the AND-test prover. He tells provers $P_1$ and $P_2$ to act as lookup provers,
and $P_3$ to act as an AND-test prover. He chooses $\alpha \in \mathbb{F} \setminus \{0, 1\}$ in some canonical way; e.g.
set $\alpha = t$ when $\mathbb{F}$ is viewed as $\mathbb{F}_2[t]/(t^{2 \cdot 3^\ell} + t^{3^\ell} + 1)$. Then, the verifier simulates the
interactive proof system from Corollary 10 with the explicit input $(m, d)$ and prover $P_3$. When the
verifier in Corollary 10 tries to read the value $h(z, b_1, b_2)$ in the implicit input, where $z \in \mathbb{F}^r$
and $b_1, b_2 \in \mathbb{F}^n$, our verifier simulates this by sending $b_1$ to $P_1$ and $b_2$ to $P_2$. Upon obtaining
answers $a_1, a_2$ to his queries from these two provers, he evaluates $f(\alpha, z, b_1, b_2, a_1, a_2)$ and
uses the result as the value of $h(z, b_1, b_2)$.

• AND test with $P_1$ as the AND-test prover. The same as above, with $P_1$ and $P_3$ swapped.

• AND test with $P_2$ as the AND-test prover. The same as above, with $P_2$ and $P_3$ swapped.

Note that each prover is asked a question $x \in \mathbb{F}^n$ distributed uniformly at random except when
he is told to act as an AND-test prover.
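
The linearity test can be simulated against classical lookup provers. The sketch below is an illustration under stated assumptions rather than the protocol itself: it uses a small prime field instead of a field of characteristic two, all three provers answer according to the same function `g`, coordinates are numbered from 0, and the function names are hypothetical. It also builds the multilinear extension of a table on $\{0,1\}^n$, which is what honest lookup provers use.

```python
import random
from itertools import product

P = 97  # small prime field, assumed only for this illustration

def multilinear_extension(table, n, p=P):
    """The unique multilinear g: F^n -> F agreeing with table on {0,1}^n."""
    def g(x):
        total = 0
        for b in product((0, 1), repeat=n):
            w = table[b]
            for xi, bi in zip(x, b):
                # factor x_i if b_i = 1, and (1 - x_i) if b_i = 0
                w = w * (xi if bi else 1 - xi) % p
            total = (total + w) % p
        return total
    return g

def linearity_check(a, b, c, xi, yi, zi, p=P):
    """Accept iff the three slopes through (xi,a), (yi,b), (zi,c) coincide."""
    inv = lambda t: pow(t % p, p - 2, p)      # inverse in the prime field
    s1 = (b - a) * inv(yi - xi) % p
    s2 = (c - b) * inv(zi - yi) % p
    s3 = (c - a) * inv(zi - xi) % p
    return s1 == s2 == s3

def run_linearity_test(g, n, rng, p=P):
    """One run of the linearity test with all provers answering via g."""
    i = rng.randrange(n)
    x = [rng.randrange(p) for _ in range(n)]
    # y_i and z_i are distinct and both differ from x_i
    yi, zi = rng.sample([t for t in range(p) if t != x[i]], 2)
    y, z = list(x), list(x)
    y[i], z[i] = yi, zi
    return linearity_check(g(x), g(y), g(z), x[i], yi, zi, p)

rng = random.Random(1)
table = {b: rng.randrange(P) for b in product((0, 1), repeat=3)}
g = multilinear_extension(table, 3)
honest_passes = all(run_linearity_test(g, 3, rng) for _ in range(50))
```

A multilinear function restricted to an axis-parallel line is affine, so all three slopes coincide and the honest strategy always passes. A function such as $x \mapsto x_1^2$ has slopes $x_1 + y_1$ and $y_1 + z_1$, which differ whenever $z_1 \ne x_1$, so it is rejected.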

3.2 Completeness 

Let $(r, n, f)$ be a yes-instance of Problem 2. Then there exists a mapping $A: \{0,1\}^n \to \mathbb{F}$ such
that Eq. (3) is satisfied for all $z \in \{0,1\}^r$ and all $b_1, b_2 \in \{0,1\}^n$ simultaneously. Let $g$ be the
unique extension of $A$ to a multilinear function $g: \mathbb{F}^n \to \mathbb{F}$. Each lookup prover answers $g(b)$ on
question $b \in \mathbb{F}^n$. Then it is clear that this deterministic strategy is accepted with certainty in the
consistency test and the linearity test. In the AND test, note that the value of $h(z, b_1, b_2)$ which the
verifier uses is given by

$$h(z, b_1, b_2) = f(\alpha, z, b_1, b_2, g(b_1), g(b_2)),$$

which is a polynomial in $z, b_1, b_2$ of degree at most $2d_f = d$ in each variable. Therefore, the promise
in Corollary 10 is satisfied and the AND-test prover has a strategy which makes the verifier accept
with certainty.

3.3 Soundness 

The soundness analysis is divided into two parts. First we analyze the consistency and linearity
tests, which only involve the questions in $\mathbb{F}^n$, and show that success in those tests implies the
following. (We refer the reader to Section 2 for some relevant notation and definitions.)

Theorem 11. There exist positive universal constants $c_0 < 1$, $c < 1$, and $C > 1$ such that the following
holds. Let $n \ge 1$ be an integer. Let $\mathbb{F}$ be a finite field, and $(|\Psi\rangle, \{A_x^a\})$ a (symmetric, projective) strategy for
the provers in the three-player multilinearity game in $n$ variables over $\mathbb{F}$ (as defined in Definition 13 below)
that passes both the consistency and the linearity tests with probability at least $1 - \varepsilon$. Assume furthermore
that $p := |\mathbb{F}| \ge n^4 \varepsilon^{-1/2}$ and $\varepsilon \le n^{-2/c_0}$. Then there exists a sub-measurement $\{V^g\}$, indexed by
multilinear $g \in \mathrm{ML}(\mathbb{F}^n, \mathbb{F})$, such that

$$\mathrm{E}_x \sum_a \mathrm{Tr}_\rho\Big(\big(A_x^a - \sqrt{V_x^a}\big)^2\Big) \le C\varepsilon^c, \qquad (5)$$

where for every $x \in \mathbb{F}^n$ and $a \in \mathbb{F}$ we defined $V_x^a := \sum_{g:\, g(x) = a} V^g$.

The proof of Theorem 11 is our main technical contribution, and it is given in Sections 4 and 5.
Assuming the theorem, we prove that our proof system has soundness error at most $1 - n^{-2/c_0}/5$,
provided $n$ is larger than an absolute constant depending on $c$, $c_0$, and $C$.

Let $(r, n, f)$ be a no-instance. Toward contradiction, suppose that the provers have a
symmetric 17 entangled strategy $S$ whose acceptance probability is at least $1 - \varepsilon/5$, where $\varepsilon = n^{-2/c_0}$.
Let $|\Psi\rangle \in \mathcal{P}_1 \otimes \mathcal{P}_2 \otimes \mathcal{P}_3$ be the state used in the strategy $S$. Let $(A_x^a)_{a \in \mathbb{F}}$ be the projective
measurements used by each of the three provers in the strategy $S$ upon question $x \in \mathbb{F}^n$ when he acts as a
lookup prover.

The verifier can be viewed as playing the multilinearity game with probability 2/5 and
performing something else, namely the AND test, with probability 3/5. Therefore, the strategy $S$
has winning probability at least $1 - \varepsilon/2$ in the multilinearity test. Because $|\mathbb{F}| = p \ge n^{1/c_0 + 4} =
n^4 \varepsilon^{-1/2}$, Theorem 11 implies that there exists a sub-measurement $\{V^g\}_{g \in \mathrm{ML}(\mathbb{F}^n, \mathbb{F})}$ such that
inequality (5) holds, where $\rho$ is the reduced state of $|\Psi\rangle\langle\Psi|$ on $\mathcal{P}_1$. For every $x \in \mathbb{F}^n$ and $a \in \mathbb{F}$,
let

$$V_x^a = \sum_{g \in \mathrm{ML}(\mathbb{F}^n, \mathbb{F}):\; g(x) = a} V^g.$$

For $0 \le i \le 2$, let $S_i$ be the entangled strategy obtained from $S$ by replacing the measurement
for the first $i$ provers $P_1, \ldots, P_i$ for question $x \in \mathbb{F}^n$ by $V_x$. 18 Note $S_0 = S$.

Let $V$ be the verifier who performs one of the consistency test, the linearity test, and the AND
test with $P_3$ as the AND-test prover each with probability 1/3. Note that when interacting with $V$,
provers $P_1$ and $P_2$ are always told to act as lookup provers. For $0 \le i \le 2$, let $p_i$ be the probability
that the strategy $S_i$ is accepted by $V$.
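
The two averaging bounds used in this argument, the winning probability $1 - \varepsilon/2$ in the multilinearity test and the lower bound $1 - \varepsilon/3$ on $p_0$, can be made explicit. As a sketch, write $\delta_1, \ldots, \delta_5$ for the rejection probabilities of $S$ in the five tests, with $\delta_1$ for the consistency test, $\delta_2$ for the linearity test and $\delta_3$ for the AND test with $P_3$ as the AND-test prover (the symbols $\delta_t$ are introduced only for this illustration), and assume that the multilinearity game of Definition 13 runs the consistency and linearity tests with probability 1/2 each:

```latex
\begin{align*}
\frac{1}{5}\sum_{t=1}^{5}\delta_t \le \frac{\varepsilon}{5}
 \;\Longrightarrow\; \sum_{t=1}^{5}\delta_t \le \varepsilon
 \;\Longrightarrow\;
 \frac{\delta_1+\delta_2}{2} \le \frac{\varepsilon}{2}
 \quad\text{and}\quad
 \frac{\delta_1+\delta_2+\delta_3}{3} \le \frac{\varepsilon}{3}.
\end{align*}
```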

By definition, $p_0 \ge 1 - \varepsilon/3$. We prove the following.

Claim 12. For $i = 1, 2$, it holds that $|p_{i-1} - p_i| \le \sqrt{C\varepsilon^c}$.

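Proof. The only difference between strategies $S_{i-1}$ and $S_i$ is the measurements used by prover $P_i$.
We refer to the message from the verifier to $P_i$ as register $A$, and to everything other than $A$ and the
private space $\mathcal{P}_i$ for prover $P_i$ as register $B$. Register $A$ is classical, but we treat it as a quantum
register which always contains a state in the computational basis. Let $\sigma$ be the global state before
prover $P_i$ performs his measurement, and $\sigma_W$ (resp. $\sigma_M$) be the global state after prover $P_i$ performs
the measurement $\{A_x^a\}_a$ (resp. $\{V_x^a\}_a$) on his share of the state, and then discards the post-measurement
state. Since the marginal distribution on the question to $P_i$ is uniform, the state $\sigma$ has the following
form:

$$\sigma = \mathrm{E}_{x \in \mathbb{F}^n} |x\rangle\langle x|_A \otimes \sigma_x^{\mathcal{P}_i B},$$

where $\mathrm{Tr}_B\, \sigma_x^{\mathcal{P}_i B} = \sigma^{\mathcal{P}_i} = \rho$ is independent of $x$. We want to bound $(1/2)\|\sigma_W - \sigma_M\|_1$, where

$$\sigma_W = \mathrm{Tr}_{\mathcal{P}_i}\Big( \mathrm{E}_{x \in \mathbb{F}^n} |x\rangle\langle x|_A \otimes \sum_{a \in \mathbb{F}} |a\rangle\langle a|_C \otimes (A_x^a \otimes I_B)\, \sigma_x^{\mathcal{P}_i B}\, (A_x^a \otimes I_B) \Big),$$

$$\sigma_M = \mathrm{Tr}_{\mathcal{P}_i}\Big( \mathrm{E}_{x \in \mathbb{F}^n} |x\rangle\langle x|_A \otimes \sum_{a \in \mathbb{F}} |a\rangle\langle a|_C \otimes \big(\sqrt{V_x^a} \otimes I_B\big)\, \sigma_x^{\mathcal{P}_i B}\, \big(\sqrt{V_x^a} \otimes I_B\big) \Big),$$

and $C$ denotes the register used for prover $P_i$'s answers. For $x \in \mathbb{F}^n$, define isometries $U_x, V_x: \mathcal{P}_i \otimes
B \to \mathcal{P}_i \otimes B \otimes C$ by

$$U_x = \sum_{a \in \mathbb{F}} A_x^a \otimes I_B \otimes |a\rangle_C, \qquad V_x = \sum_{a \in \mathbb{F}} \sqrt{V_x^a} \otimes I_B \otimes |a\rangle_C.$$
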

17 Lemma 7 shows that we may assume this holds without loss of generality. 

18 Since $V$ is a sub-measurement, the $V_x^a$ may not sum to identity. In that case we introduce an additional outcome
"fail", corresponding to the element $\mathrm{Id} - \sum_a V_x^a$. Whenever a prover obtains that outcome he aborts the protocol.






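Then,

$$\begin{aligned}
\|\sigma_W - \sigma_M\|_1
&\le \Big\| \mathrm{E}_{x \in \mathbb{F}^n} |x\rangle\langle x|_A \otimes \sum_{a \in \mathbb{F}} |a\rangle\langle a|_C \otimes \Big( (A_x^a \otimes I_B)\, \sigma_x^{\mathcal{P}_i B}\, (A_x^a \otimes I_B) - \big(\sqrt{V_x^a} \otimes I_B\big)\, \sigma_x^{\mathcal{P}_i B}\, \big(\sqrt{V_x^a} \otimes I_B\big) \Big) \Big\|_1 \\
&\le \mathrm{E}_{x \in \mathbb{F}^n} \Big\| \sum_{a \in \mathbb{F}} |a\rangle\langle a|_C \otimes \Big( (A_x^a \otimes I_B)\, \sigma_x^{\mathcal{P}_i B}\, (A_x^a \otimes I_B) - \big(\sqrt{V_x^a} \otimes I_B\big)\, \sigma_x^{\mathcal{P}_i B}\, \big(\sqrt{V_x^a} \otimes I_B\big) \Big) \Big\|_1 \\
&\le 2\, \mathrm{E}_{x \in \mathbb{F}^n} \sqrt{ \sum_{a \in \mathbb{F}} \mathrm{Tr}\Big( \big(A_x^a - \sqrt{V_x^a}\big)^2 \rho \Big) } \\
&\le 2 \sqrt{ \mathrm{E}_{x \in \mathbb{F}^n} \sum_{a \in \mathbb{F}} \mathrm{Tr}\Big( \big(A_x^a - \sqrt{V_x^a}\big)^2 \rho \Big) } \\
&\le 2\sqrt{C\varepsilon^c},
\end{aligned}$$

where the third inequality is by Lemma 35, the fourth is by concavity of the square root, and the last by (5). Therefore,
we have that $|p_{i-1} - p_i| \le (1/2)\|\sigma_W - \sigma_M\|_1 \le \sqrt{C\varepsilon^c}$ as claimed. □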

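By the triangle inequality, Claim 12 implies that $|p_0 - p_2| \leq 2\sqrt{C\varepsilon^c}$, and therefore

\[ p_2 \geq p_0 - 2\sqrt{C\varepsilon^c} \geq 1 - \varepsilon/3 - 2\sqrt{C\varepsilon^c} \geq 1 - 5\sqrt{C\varepsilon^c}, \]

where the last inequality uses $c < 1$ and $C > 1$.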

Note that when the provers using strategy $S_2$ interact with $V$, both provers $P_1$ and $P_2$ can
be implemented so that they measure the prior entanglement without looking at their questions.
Since $P_3$ is the only prover who might measure the prior entanglement after looking at his question,
strategy $S_2$ can be implemented using shared randomness alone.

If $P_1$ and $P_2$ choose different multilinear functions, then the provers pass the consistency
test with probability at most $n/|\mathbb{F}| \leq 1/6$ by the Schwartz-Zippel lemma [Sch80, Zip79] (see
Lemma 33 in Appendix A for a statement). In strategy $S_2$, they pass the consistency test with
probability at least $1 - 15\sqrt{C\varepsilon^c}$. Therefore, provers $P_1$ and $P_2$ choose the same multilinear function
with probability at least $1 - 15\sqrt{C\varepsilon^c}/(1 - 1/6) = 1 - 18\sqrt{C\varepsilon^c}$. This implies that if an oracle
chooses a multilinear function in the same way as prover $P_1$ and uses it for the two queries, the
distribution on their answers will differ by at most $18\sqrt{C\varepsilon^c}$ in statistical distance. Therefore, this
oracle (which always implements a multilinear function) together with prover $P_3$ is accepted in
the interactive proof system of Corollary 10 with probability at least
$1 - 15\sqrt{C\varepsilon^c} - 18\sqrt{C\varepsilon^c} = 1 - 33\sqrt{C\varepsilon^c}$.
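
The Schwartz-Zippel bound invoked above can be checked exhaustively on a toy instance. The following is only an illustrative sketch, not part of the proof: it assumes a small prime field (so that arithmetic modulo `p` is field arithmetic), and the names `eval_multilinear` and `max_agreement` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import product


def eval_multilinear(coeffs, x, p):
    """Evaluate sum over masks S of coeffs[S] * prod_{i in S} x_i, modulo p."""
    total = 0
    for mask, c in coeffs.items():
        term = c
        for bit, xi in zip(mask, x):
            if bit:
                term = term * xi % p
        total = (total + term) % p
    return total


def max_agreement(p, n):
    """Largest fraction of F_p^n on which two distinct multilinear maps agree."""
    masks = list(product((0, 1), repeat=n))
    points = list(product(range(p), repeat=n))
    worst = 0
    # g and g' agree at x exactly when the nonzero difference g - g' vanishes
    # at x, so it suffices to count the zeros of every nonzero multilinear map.
    for values in product(range(p), repeat=len(masks)):
        if not any(values):
            continue
        diff = dict(zip(masks, values))
        zeros = sum(1 for x in points if eval_multilinear(diff, x, p) == 0)
        worst = max(worst, zeros)
    return Fraction(worst, len(points))
```

For instance, `max_agreement(5, 2)` stays below the bound $n/|\mathbb{F}| = 2/5$. The brute-force search is only feasible for very small `p` and `n`.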

Because (r, n,f) is a no-instance of Problem 2 and |F| = p > 8q(m, d), the acceptance probabil- 
ity in the interactive proof system of Corollary 10 is less than 3/4. Comparing this with the lower 
bound in the previous paragraph, we obtain

\[ 1 - 33\sqrt{C\varepsilon^c} < \frac{3}{4}, \]

which implies

\[ \varepsilon > \frac{1}{(132^2\, C)^{1/c}}, \]

contradicting the definition $\varepsilon = n^{-2/c_0}$ as soon as $n$ is large enough. Since we obtained this
contradiction from the assumption that there exists an entangled strategy with acceptance probability at
least $1 - \varepsilon/5$, we have proved the claimed soundness guarantee against entangled provers.
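
The arithmetic behind the final contradiction can be checked numerically. The sketch below is illustrative only: the values of `C` and `c` used in the usage note are arbitrary placeholders (the text only requires $C > 1$ and $c < 1$), and the function names are hypothetical.

```python
import math


def soundness_lower_bound(eps, C, c):
    """Lower bound 1 - 33*sqrt(C*eps^c) on the oracle strategy's acceptance."""
    return 1 - 33 * math.sqrt(C * eps ** c)


def epsilon_threshold(C, c):
    """Value (132^2 * C)^(-1/c) at which the lower bound equals 3/4."""
    # 33*sqrt(C*eps^c) = 1/4  <=>  C*eps^c = 1/132^2  <=>  eps = (132^2*C)^(-1/c)
    return (132 ** 2 * C) ** (-1.0 / c)
```

With the placeholder values `C = 2.0` and `c = 0.5`, any `eps` below `epsilon_threshold(C, c)` makes the lower bound exceed 3/4, which is the contradiction used in the text.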
4 The multilinearity game

In this section we analyze the combination of the consistency test and the linearity test described
in Section 3 as a stand-alone game played between a referee and $r \geq 3$ players, which we call the
r-player multilinearity game in n variables over $\mathbb{F}$. The game is parametrized by an integer $n$ and a
finite field $\mathbb{F}$ of arbitrary size $p = |\mathbb{F}|$ (which is not necessarily a prime), and it is performed with
$r$ players $P_1, \ldots, P_r$ treated symmetrically. The referee performs either of the following two tests
with probability 1/2 each:

• Consistency test. The referee chooses $x \in \mathbb{F}^n$ uniformly at random and sends the same question
$x$ to all players $P_1, \ldots, P_r$. He expects each player to answer with an element of $\mathbb{F}$, and
accepts if and only if all the answers are equal.

• Linearity test. The referee chooses $i \in \{1, \ldots, n\}$, $x \in \mathbb{F}^n$ and $y_i \neq z_i \in \mathbb{F} \setminus \{x_i\}$ uniformly at
random, and sets $y_j = z_j = x_j$ for every $j \in \{1, \ldots, n\} \setminus \{i\}$. He sends $x, y, z$ to three out of
the $r$ players chosen at random, receives $a, b, c \in \mathbb{F}$, and accepts if and only if

\[ \frac{b - a}{y_i - x_i} = \frac{c - b}{z_i - y_i} = \frac{c - a}{z_i - x_i}. \]
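
The two tests can be sketched as a small classical simulation. This is a minimal illustration under simplifying assumptions that are not in the text: the field is taken to be a prime field $\mathbb{F}_p$ with $p \geq 3$ (the game allows arbitrary finite fields), all players share one deterministic answer function `strategy`, and the names `linearity_check` and `play_round` are hypothetical.

```python
import random


def linearity_check(x_i, y_i, z_i, a, b, c, p):
    """Accept iff the three slopes through (x_i,a), (y_i,b), (z_i,c) coincide mod p."""
    def inv(t):
        # Modular inverse via Fermat's little theorem; assumes p is prime.
        return pow(t % p, p - 2, p)

    s1 = (b - a) * inv(y_i - x_i) % p
    s2 = (c - b) * inv(z_i - y_i) % p
    s3 = (c - a) * inv(z_i - x_i) % p
    return s1 == s2 == s3


def play_round(strategy, n, p, r=3, rng=random):
    """One round of the game against r players who all answer with `strategy`."""
    if rng.random() < 0.5:
        # Consistency test: same question to everyone, all answers must match.
        x = tuple(rng.randrange(p) for _ in range(n))
        return len({strategy(x) for _ in range(r)}) == 1
    # Linearity test: three distinct points on a random axis-parallel line.
    i = rng.randrange(n)
    x = [rng.randrange(p) for _ in range(n)]
    y_i, z_i = rng.sample([t for t in range(p) if t != x[i]], 2)
    y = x[:i] + [y_i] + x[i + 1:]
    z = x[:i] + [z_i] + x[i + 1:]
    a, b, c = strategy(tuple(x)), strategy(tuple(y)), strategy(tuple(z))
    return linearity_check(x[i], y_i, z_i, a, b, c, p)
```

A shared multilinear function such as `lambda x: (3 + 2 * x[0] + x[0] * x[1]) % 7` passes every round, while a function that is quadratic in some coordinate fails the linearity check along that coordinate.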

We now define explicitly what we mean by a strategy for the players in the multilinearity game. 

Definition 13. A strategy for the players in the r-player multilinearity game in n variables over $\mathbb{F}$
is given by the following. Finite-dimensional Hilbert spaces $\mathcal{P}_1, \ldots, \mathcal{P}_r$, a state $|\Psi\rangle \in \mathcal{P}_1 \otimes \cdots \otimes \mathcal{P}_r$, and
for every $i \in [r]$ and $x \in \mathbb{F}^n$ a measurement $\{(A_i)_x^a\}_{a \in \mathbb{F}}$ on $\mathcal{P}_i$. It is understood that, upon receiving
question $x \in \mathbb{F}^n$, player $P_i$ measures register $\mathcal{P}_i$ corresponding to his share of $|\Psi\rangle$ using the measurement
$\{(A_i)_x^a\}_{a \in \mathbb{F}}$, sending the outcome $a$ back to the verifier as his answer.

We will say that a strategy is symmetric if $\mathcal{P}_1 = \cdots = \mathcal{P}_r$, $(A_1)_x^a = \cdots = (A_r)_x^a$ for every $x$ and $a$
(in which case we will simply call the resulting measurement $\{A_x^a\}$), and $|\Psi\rangle$ is invariant with respect to
arbitrary permutation of the registers $\mathcal{P}_1, \ldots, \mathcal{P}_r$.

Finally, a strategy will be called projective if all measurements $\{(A_i)_x^a\}_{a \in \mathbb{F}}$ are projective.

In case a strategy is symmetric, we will often abuse notation and use the symbol $\rho$ to denote the
reduced density of $|\Psi\rangle$ on any $\bigotimes_{i \in S} \mathcal{P}_i$, for $S \subseteq [r]$, without specifying explicitly which registers
are understood: by symmetry only the number of registers matters, and this will always be clear
in context.

The main result of this section is the following. We refer to Section 2.1 for definitions of the 
quantities appearing in the theorem, and to Lemma 7 for a proof that the symmetry assumption 
made in the theorem is without loss of generality. 

Theorem 14. There exist universal constants $0 < c_0 < 1$, $C_0 > 1$ such that the following holds. Let
$(|\Psi\rangle, \{A_x^a\}_a)$ be a permutation-invariant projective strategy for $r \geq 3$ players in the r-player multilinearity
game in n variables over $\mathbb{F}$ with success probability at least $1 - \varepsilon/2$. Assume furthermore that
$p = |\mathbb{F}| \geq n^4 \varepsilon^{-1/2}$ and $\varepsilon \leq n^{-2/c_0}$. Then there exists a sub-measurement $\{V^g\}_{g \in \mathrm{ML}(\mathbb{F}^n, \mathbb{F})}$, indexed by multilinear
$g : \mathbb{F}^n \to \mathbb{F}$, such that

1. $V$ is consistent with $A$: $\mathrm{INC}(V, A) \leq C_0\, \varepsilon^{c_0}$,

2. $\mathrm{Tr}_\rho(V) \geq 1 - C_0\, \varepsilon^{c_0}$.

The two items in the conclusion of the theorem intuitively state the following. Suppose that
one of the players in the multilinearity game were to receive a question $x \in \mathbb{F}^n$, measure his share
of the entangled state $|\Psi\rangle$ according to the projective measurement $\{A_x^a\}_a$, and answer the outcome
he obtains (as he would in the original game). Now, suppose further that another player, upon
receiving the same question $x \in \mathbb{F}^n$, instead of measuring her own share of $|\Psi\rangle$ according to $\{A_x^a\}_a$,
were to perform the measurement $\{V^g, \mathrm{Id} - V\}$, where $V = \sum_g V^g$ (which is independent
of $x$!). If she obtains the last outcome then she aborts the experiment. If, however, she obtains an
outcome $g \in \mathrm{ML}(\mathbb{F}^n, \mathbb{F})$, then she answers her question $x$ with $g(x)$. Item 1 above states that,
on average over the choice of $x$, the probability that both players eventually produce different
outcomes (conditioned on the second player not aborting) is at most $O(\varepsilon^{c_0})$. Item 2 guarantees
that, in the hypothetical scenario we just described, the second player does not abort too often: the
probability that she obtains the outcome "$\mathrm{Id} - V$" is at most $C_0\, \varepsilon^{c_0}$.

We will show that Theorem 14 implies Theorem 11 in Section 4.2, while Theorem 14 will be 
proved in Section 5. In the following section we prove a weaker version of the multilinearity test, 
the "linearity test", which implies Theorem 14 for n = 1. 



4.1 Preliminary analysis: the linearity test 

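Let $(|\Psi\rangle, \{A_x^a\}_a)$ be a symmetric projective strategy for the players in the multilinearity game, as
defined in Definition 13. The following relations translate the assumption that the players succeed
in the consistency test with probability $1 - \varepsilon$, and in the linearity test with probability $1 - \varepsilon$.

\[ \mathbb{E}_x \sum_a \mathrm{Tr}_\rho\big( A_x^a \otimes A_x^a \big) \geq 1 - \varepsilon, \qquad (6) \]

\[ \forall i \in [n], \quad \mathbb{E}_{x,\; x_i \neq x'_i \neq x''_i} \sum_{\substack{a, a', a'' : \\ \frac{a' - a}{x'_i - x_i} = \frac{a'' - a'}{x''_i - x'_i} = \frac{a'' - a}{x''_i - x_i}}} \mathrm{Tr}_\rho\Big( A_x^a \otimes A^{a'}_{x'_i, x_{\neg i}} \otimes A^{a''}_{x''_i, x_{\neg i}} \Big) \geq 1 - n\varepsilon \geq 1 - \sqrt{\varepsilon}, \qquad (7) \]

where all expectations are taken under the uniform distribution over the sets in which their indices
range, and the last inequality follows from our assumption that $n \leq \varepsilon^{-c_0/2} \leq \varepsilon^{-1/2}$.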

The following claim proves the "linearity" part of the multilinearity test, thereby establishing 
the base case for the induction that will be performed in Section 5. It also illustrates some of the key 
techniques, in terms of the manipulation of measurement operators, that will be used throughout 
the paper. (The interested reader may thus wish to gain good familiarity with the proof of the 
claim before moving on to later sections, in which proofs will not always be as detailed.) 

Claim 15. Let $i \in [n]$, and $\varepsilon > 0$. Suppose that $(|\Psi\rangle, \{A_x^a\})$ is a (symmetric, projective) strategy
passing the consistency test with probability at least $1 - \varepsilon$, and the linearity test in the i-th direction with
probability at least $1 - \sqrt{\varepsilon}$. Then there exists a family of measurements $\{B^\ell_{x_{\neg i}}\}_{\ell \in \mathrm{ML}(\mathbb{F}, \mathbb{F})}$ of arity 1 such
that

\[ \mathbb{E}_x \sum_a \Big\| A_x^a - \sum_{\ell :\, \ell(x_i) = a} B^\ell_{x_{\neg i}} \Big\|_\rho^2 = O(\sqrt{\varepsilon}). \qquad (8) \]

We will often use the notation $B_x^a := \sum_{\ell :\, \ell(x_i) = a} B^\ell_{x_{\neg i}}$, leaving the dependence on $i$ implicit. We
note for future use that the bound (8) implies that

\[ \mathrm{CON}(A, B) \geq 1 - O(\sqrt{\varepsilon}) \quad \text{and} \quad \mathrm{CON}(B) \geq 1 - O(\sqrt{\varepsilon}). \]

These inequalities can be deduced directly from (8), but they will also be apparent from the proof
of Claim 15, which we now give.
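Proof. For any $\ell \in \mathrm{ML}(\mathbb{F}, \mathbb{F})$, define

\[ B^\ell_{x_{\neg i}} := \mathbb{E}_{x_i \neq x'_i}\; A^{\ell(x_i)}_{x}\, A^{\ell(x'_i)}_{x'_i, x_{\neg i}}\, A^{\ell(x_i)}_{x}. \]

Then $\{B^\ell_{x_{\neg i}}\}_\ell$ is a well-defined measurement: each operator is non-negative, and since for fixed
$x_i \neq x'_i$, as $\ell$ ranges over $\mathrm{ML}(\mathbb{F}, \mathbb{F})$ both $\ell(x_i)$ and $\ell(x'_i)$ independently range over $\mathbb{F}$, they sum to
$\sum_a (A_x^a)^2 = \mathrm{Id}$ since, by assumption, for every $x$ and $a$ the measurement operator $A_x^a$ is a projector.
Using the definition of $\|\cdot\|_\rho$, we can expand

\begin{align*}
\mathbb{E}_x \sum_a \Big\| A_x^a - \sum_{\ell :\, \ell(x_i) = a} B^\ell_{x_{\neg i}} \Big\|_\rho^2
&= \mathbb{E}_x \sum_a \mathrm{Tr}_\rho\big( A_x^a \big) + \mathbb{E}_x \sum_{\ell, \ell' :\, \ell(x_i) = \ell'(x_i)} \mathrm{Tr}_\rho\big( B^\ell_{x_{\neg i}} B^{\ell'}_{x_{\neg i}} \big) \\
&\quad - 2\, \mathbb{E}_x \sum_{a, \ell :\, \ell(x_i) = a} \mathrm{Tr}_\rho\big( A_x^a B^\ell_{x_{\neg i}} \big). \qquad (9)
\end{align*}

We first lower bound the last term above. Applying Lemma 40 from Appendix B with $T_x^a = A_x^a$
and $Z_x^a = B_x^a$, we get

\[ \Big| \mathbb{E}_x \sum_{a, \ell :\, \ell(x_i) = a} \mathrm{Tr}_\rho\big( A_x^a B^\ell_{x_{\neg i}} \big) - \mathbb{E}_x \sum_{a, \ell :\, \ell(x_i) = a} \mathrm{Tr}_\rho\big( A_x^a \otimes B^\ell_{x_{\neg i}} \big) \Big| = O\big( \mathrm{INC}(A)^{1/2} \big) = O(\sqrt{\varepsilon}) \qquad (10) \]

by (6), hence it will suffice to show a lower bound on $\mathbb{E}_x \sum_{a, \ell :\, \ell(x_i) = a} \mathrm{Tr}_\rho( A_x^a \otimes B^\ell_{x_{\neg i}} )$. Using the
definition of $B^\ell_{x_{\neg i}}$, we have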



E x E Tr^A^J 



= E, E Tr„ (A; ® A t Xi> AV } A i ) 

x,x i ^x i i_t p\ X^ X\,X^i x",X^j x\,X^i> 

a,i:i{xj)=a 

a,tl{xi)=a a' 



< E x x ,, x „ V Jr p (A a ® AT iJ A* , x ' ; AT <; (g, A , ' ) + £ 

— X,Xj^=Xj /_l i>\ X^ X\,X^i x",X^i X\,X^i x\,X^i> 

a,i:i{xj)=a 

<E^ r E ETr P K ®A^A^A^®A^) +e/ 

a,k£(xi)=a a 1 



(11) 



where the first equality simply uses that the $A^{a'}_{x'_i, x_{\neg i}}$ sum to identity over $a'$, the first inequality
uses (6) on the last two registers (together with $A_x^a \leq \mathrm{Id}$), and the last is by positivity. Let $\sigma$
be the reduced density of $|\Psi\rangle$ on any 3 of the provers, and apply Claim 37 to the POVM $\{A_x^a\}_a$ for
every $x$. Eq. (6) implies that this POVM is consistent, hence



EK^id)p( 2 )(A^id)- P ( 2 ) = o(Vi), 
where we used that the $A_x^a$ are projectors. Hence



Ej. x i+ x h V Y\Tr p (A x ® (A a ', AV ] A a , - AT ' ) ® A, 1 ' ) 

a,£:t(xi)=tt a' 



a,l: l[xi)=a 



(£(Id ® A^ $5 Id) p (Id ® A x \^ ® Id) - p) ) 

a' 



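where for the inequality we used that for every $x$ and $x'_i \neq x''_i$,
$\sum_{a, \ell :\, \ell(x_i) = a} A_x^a \otimes A^{\ell(x''_i)}_{x''_i, x_{\neg i}} \otimes A^{\ell(x'_i)}_{x'_i, x_{\neg i}} \leq \mathrm{Id}$, and monotonicity of the trace distance. Combining this last bound with (11), we obtain

\[ \mathbb{E}_x \sum_{a, \ell :\, \ell(x_i) = a} \mathrm{Tr}_\rho\big( A_x^a \otimes B^\ell_{x_{\neg i}} \big) = \mathbb{E}_{x,\, x'_i \neq x''_i} \sum_{a, \ell :\, \ell(x_i) = a} \mathrm{Tr}_\rho\Big( A_x^a \otimes A^{\ell(x''_i)}_{x''_i, x_{\neg i}} \otimes A^{\ell(x'_i)}_{x'_i, x_{\neg i}} \Big) + O(\sqrt{\varepsilon}). \]

If $x_i = x'_i$ or $x_i = x''_i$, the last summation above evaluates to 1. Hence the expectation is at least as
large as the probability that the $\{A_x^a\}$ pass the linearity test along the i-th coordinate, which is at
least $1 - \sqrt{\varepsilon}$ by (7), hence

\[ \mathbb{E}_x \sum_{a, \ell :\, \ell(x_i) = a} \mathrm{Tr}_\rho\big( A_x^a \otimes B^\ell_{x_{\neg i}} \big) \geq 1 - O(\sqrt{\varepsilon}). \]
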

Combining this inequality with (10) and using that the first two terms in (9) are at most 1 each 
proves the claim. □ 
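
The counting fact used at the start of the proof of Claim 15 (for two distinct points, the pair of values of a linear function ranges over all of $\mathbb{F} \times \mathbb{F}$ exactly once) can be checked directly on a small field. The sketch below assumes a prime field $\mathbb{F}_p$ and models linear functions as affine maps $t \mapsto c_0 + c_1 t$; the function name is hypothetical.

```python
from itertools import product


def line_values_are_bijective(p, u, v):
    """Check that l -> (l(u), l(v)) maps the p^2 affine maps on F_p onto F_p x F_p."""
    images = {
        ((c0 + c1 * u) % p, (c0 + c1 * v) % p)
        for c0, c1 in product(range(p), repeat=2)
    }
    # p^2 affine maps and p^2 possible value pairs: onto implies one-to-one.
    return len(images) == p * p
```

The check succeeds whenever `u` and `v` are distinct modulo `p` and fails when they coincide, which is why the definition of the "lines" measurement averages over pairs of distinct points.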

4.2 Proof of Theorem 11 

In this section we show how Theorem 11, which is the result we need in order to analyze the 
overall protocol from Section 3, follows from Theorem 14. Theorem 14 is proved in Section 5. 

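Proof of Theorem 11. Let $\{V^g\}_{g \in \mathrm{ML}(\mathbb{F}^n, \mathbb{F})}$ be the sub-measurement guaranteed by Theorem 14. Expanding

\begin{align*}
\mathbb{E}_x \sum_a \mathrm{Tr}_\rho\Big( \big(A_x^a - \sqrt{V_x^a}\big)^2 \Big)
&= \mathbb{E}_x \sum_a \Big( \mathrm{Tr}_\rho\big( (A_x^a)^2 \big) + \mathrm{Tr}_\rho\big( V_x^a \big) - 2\, \mathrm{Tr}_\rho\big( A_x^a \sqrt{V_x^a} \big) \Big) \\
&\leq 2 - 2\, \mathbb{E}_x \sum_a \mathrm{Tr}_\rho\big( A_x^a \sqrt{V_x^a} \big), \qquad (12)
\end{align*}

it will suffice to show that this last expectation is close to 1. By applying Lemma 40 from Appendix B
with $T_x^a = A_x^a$ and $Z_x^a = \sqrt{V_x^a}$ we obtain that

\[ \Big| \mathbb{E}_x \sum_a \mathrm{Tr}_\rho\big( A_x^a \sqrt{V_x^a} \big) - \mathbb{E}_x \sum_a \mathrm{Tr}_\rho\big( A_x^a \otimes \sqrt{V_x^a} \big) \Big| = O\big( \mathrm{INC}(A)^{1/2} \big) = O(\sqrt{\varepsilon}) \]

by (6). Hence to upper-bound the right-hand side of (12) it suffices to lower-bound

\begin{align*}
\mathbb{E}_x \sum_a \mathrm{Tr}_\rho\big( A_x^a \otimes \sqrt{V_x^a} \big)
&\geq \mathbb{E}_x \sum_a \mathrm{Tr}_\rho\big( A_x^a \otimes V_x^a \big) \\
&\geq 1 - C_0\, \varepsilon^{c_0} - \mathrm{INC}(V, A) \\
&\geq 1 - 2 C_0\, \varepsilon^{c_0},
\end{align*}

where the second inequality uses item 2 from Theorem 14 and the definition of $\mathrm{INC}(V, A)$, and the
last inequality follows from item 1. Combined with (12), this proves Theorem 11. □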

5 Soundness analysis of the multilinearity game 

In this section we prove our main result on the analysis of the multilinearity game in the presence 
of entanglement between the provers, Theorem 14. The proof proceeds by induction, and the key 
inductive step is summed up in the following proposition. (We refer to section 2.1 for a definition 
of the quantities that appear in the proposition.) 

Proposition 16. There exists a universal constant $0 < c_1 < 1/2$ such that the following holds. Suppose
that $(|\Psi\rangle, \{A_x^a\}_a)$ is a symmetric projective strategy for the players in the 3-player multilinearity game in n
variables over $\mathbb{F}$ that is accepted with probability at least $1 - \varepsilon$ in both the linearity test and the consistency
test, for some $\varepsilon > 0$. Let $p := |\mathbb{F}|$ and $\delta > 0$, and assume that $n^{-8/c_1} \geq \delta \geq \sqrt{n}\, \varepsilon^{1/8} \geq n p^{-1/4}$. Let
$1 \leq k \leq n - 1$ and $T$ be a given family of sub-measurements of arity $k$ such that $\mathrm{INC}(T, A) \leq \delta$. Then
there exists a family of sub-measurements $V$ of arity $k + 1$ such that

1. $\mathrm{INC}(V, A) = O(\varepsilon^{c_1})$,

2. For any family of sub-measurements $P$ of arity at least $k + 1$,

\[ |\mathrm{CON}(P, V) - \mathrm{CON}(P, T)| = O\big( \delta^{c_1} + \mathrm{INC}(P, A)^{1/2} \big), \]

3. For any family of sub-measurements $P$, of arbitrary arity,

\[ |\mathrm{CON}(P, V) - \mathrm{CON}(P, T)| = O\big( \delta^{c_1} + |\mathrm{CON}(T, T) - \mathrm{Tr}_\rho(T)|^{1/2} \big). \]

We first show that Theorem 14 follows from Proposition 16. 

Proof of Theorem 14. Starting from $V_0 = A$, let $V_1, \ldots, V_n$ be the sequence of measurements of
increasing arity $1, \ldots, n$ given by Proposition 16. By item 1, for every $i \in [n]$ we have
$\mathrm{INC}(V_i, A) \leq C_1 \varepsilon^{c_1}$ for some universal constant $C_1$. Applying item 2 to $P = V_i$ and $V = V_i, V_{i-1}, \ldots, V_0$, an easy
induction shows that

\cON(V if Vi) -C0N(Vj,A)| = 0(i (e Cl2 +e Cl/2 )). 

Hence using item 1 and $\mathrm{INC}(V_i, A) + \mathrm{CON}(V_i, A) = \mathrm{Tr}_\rho(V_i)$, since $A$ is a complete family of
measurements, we also get

|cON(V;vV;0-Trp(V/)| = 0{ie c "), 
where we used $c_1 < 1/2$. Applying item 3 with $P = A$, an immediate induction then gives

|C0N(V n ,A) -CON(A,A)| = 0{n^/n£ c ^ n ). 

But $\mathrm{CON}(A, A) \geq 1 - \varepsilon$ by (6), and using $\mathrm{Tr}_\rho(V_n) = \mathrm{CON}(V_n, A) + \mathrm{INC}(V_n, A)$ once more the
theorem is proved for an appropriate choice of the constants $c_0$, $C_0$. □
The proof of Proposition 16 itself proceeds by induction, and is based on two lemmas. The
first is a quantum analogue of the "self-improvement lemma" [BFL91, Lemma 5.10]. It shows
that, if a family of sub-measurements $\{R^g_{x_{\geq k}}\}$ is weakly consistent with $\{A_x^a\}$, and it passes the
consistency and linearity tests with high probability, then there exists an "improved" family of
sub-measurements $\{T^g_{x_{\geq k}}\}$ that are highly consistent with $\{A_x^a\}$. (Item 3 in the conclusion of the
lemma is not ultimately needed, but is required to combine Lemma 17 with Lemma 18 in the proof
of Proposition 16.)

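Lemma 17 (Self-improvement lemma). Let $(|\Psi\rangle, \{A_x^a\}_a)$ be a (symmetric, projective) strategy for 3
players in the multilinearity game, and $n^{-8} \geq \delta \geq \sqrt{n}\, \varepsilon^{1/8} \geq 1/p$ such that the following hold:

1. The strategy $(|\Psi\rangle, \{A_x^a\}_a)$ is accepted with probability at least $1 - \varepsilon/2$ in the multilinearity game,

2. There exists a family of sub-measurements $R$ of arity $k$ such that $\mathrm{INC}(R, A) \leq \delta$.

Then there exists a family of sub-measurements $T$ of arity $k$, together with, for every $x \in \mathbb{F}^n$, a family of
matrices $\{\hat{S}_x^g\}_g$, indexed by $g \in \mathrm{ML}(\mathbb{F}^{k-1}, \mathbb{F})$, such that the following hold:

1. $\mathrm{INC}(T, A) = O(\varepsilon^{1/16})$,

2. For any family of sub-measurements $P$, of arbitrary arity, $|\mathrm{CON}(P, R) - \mathrm{CON}(P, T)| = O(\sqrt{\delta})$,

3. For every $x$ and $a$, $\sum_{g :\, g(x_{<k}) = a} \hat{S}_x^g (\hat{S}_x^g)^\dagger \leq A_x^a$, and for every $x_{\geq k}$ and $g$,
$T^g_{x_{\geq k}} = \big( \mathbb{E}_{x_{<k}} \hat{S}_x^g \big) \big( \mathbb{E}_{x_{<k}} \hat{S}_x^g \big)^\dagger$, and

\[ \mathbb{E}_x \sum_g \Big\| \hat{S}_x^g - \mathbb{E}_{x_{<k}} \hat{S}_x^g \Big\|_\rho^2 \leq \delta. \]
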

The second lemma is an analogue of the "pasting lemma" [BFL91, Lemma 5.11]. It shows 
how, starting from a family of sub-measurements T of arity k that is consistent with A, one may 
construct a family of sub-measurements V of increased arity k + 1 that is still somewhat consistent 
with A, as expressed in item 1 below. Items 2 and 3 are important to ensure that the new sub- 
measurement V is not "too incomplete", which would render item 1 trivial. 

Lemma 18 (Pasting lemma). There exists a universal constant $0 < c_2 < 1$ such that the following holds.
Let $\varepsilon, \delta > 0$ be such that $n p^{-1} \leq \varepsilon \leq \delta^2$. Let $(|\Psi\rangle, \{A_x^a\}_a)$ be a (symmetric, projective) strategy for 3
players that is accepted with probability at least $1 - \varepsilon/2$ in the multilinearity game. Let $1 \leq k \leq n - 1$
and $T$ a family of sub-measurements of arity $k$ such that $\mathrm{INC}(T, A) \leq \delta$, and $T$ satisfies item 3 in the
conclusion of Lemma 17. Then there exists a family of sub-measurements $V$ of arity $k + 1$ such that

1. $V$ is consistent with $A$: $\mathrm{INC}(V, A) = O(\delta^{c_2})$,

2. For any family of sub-measurements $P$ of arity at least $k + 1$,

\[ |\mathrm{CON}(P, V) - \mathrm{CON}(P, T)| = O\big( \delta^{c_2} + \mathrm{INC}(P, A)^{1/2} \big), \]

3. For any family of sub-measurements $P$, of arbitrary arity,

\[ |\mathrm{CON}(P, V) - \mathrm{CON}(P, T)| = O\big( \delta^{c_2} + |\mathrm{CON}(T, T) - \mathrm{Tr}_\rho(T)|^{1/2} \big). \]

Proposition 16 follows almost immediately by combining the two lemmas.
Proof of Proposition 16. Let $T$ be the family of sub-measurements given in the statement of the
proposition. First apply Lemma 18 to $T$, obtaining a family of sub-measurements $R$ (called $V$
in the lemma) of arity $k + 1$ such that items 1, 2 and 3 in the conclusion of the lemma hold. Next
apply Lemma 17 to $R$, obtaining a family of sub-measurements $V$ of arity $k + 1$ (called $T$ in the
lemma) such that items 1 and 2 hold, where given our assumption $\mathrm{INC}(T, A) \leq \delta$ and item 1 from
Lemma 18 the bound in item 2 is $O(\delta^{c_2/2})$. Item 1 from Lemma 17 implies item 1 in the proposition
(provided $c_1$ is chosen small enough), and item 2 (resp. item 3) follows from combining item 2
from Lemma 17 with item 2 (resp. item 3) from Lemma 18. □

5.1 The self-improvement lemma

In this section we prove Lemma 17. Before proceeding with the details, we give some intuition 
and a high-level overview of how we will proceed. 

Consider the following simplified situation in $n = 2$ dimensions. Although we will eventually
require $p$ to be a large power of 2, for the purposes of this overview it is sufficient to think about
the case $p = 2$, so that the players' answers are simply bits. For every $x \in \mathbb{F}^2$ we are given a
two-outcome projective measurement $(A_x^0, A_x^1)$; picture two orthogonal "planes" of dimension $d/2$
each, where $d$ is the dimension of either player's private space and can be arbitrarily large. Our
goal is to find a global "refinement" of these planes: a single measurement $\{T^g\}$, with outcomes in
the set of bilinear functions $g : \mathbb{F}^2 \to \mathbb{F}$, such that at every $x$ the approximation $A_x^a \approx_\varepsilon \sum_{g :\, g(x) = a} T^g$
holds. 19 In order to achieve this, we make two additional assumptions:

1. There exists another measurement $\{R^g\}$ which achieves an approximation of weaker quality,
up to some $\delta \gg \varepsilon$, than the one we are looking for,

2. The $\{A_x^a\}$ are very close to linear: for every axis-parallel line $(x_1, \cdot)$ (resp. $(\cdot, x_2)$) there is a
measurement $\{B^\ell_{x_1}\}_\ell$ (resp. $\{B^\ell_{x_2}\}_\ell$) with outcomes in the set of linear functions $\ell : \mathbb{F} \to \mathbb{F}$
such that $A^a_{(x_1, x_2)} \approx_\varepsilon \sum_{\ell :\, \ell(x_2) = a} B^\ell_{x_1}$ (resp. $\approx_\varepsilon \sum_{\ell :\, \ell(x_1) = a} B^\ell_{x_2}$).

The goal is to use the high quality of the approximation along lines to improve the quality of
the overall "bilinear" approximation. Let's trust that an ideal measurement $\{T^g\}$, achieving an
approximation of order $\varepsilon$, exists, and think of $\{R^g\}$ as an adversarially "corrupted" version of
$\{T^g\}$. There are two main ways in which $\{T^g\}$ can be corrupted: the first is by applying an
arbitrary (but not too large) rotation on the whole space. The second is by "mislabeling" some of
the measurement elements: e.g. for some $g$, a subspace of the space on which the ideal operator
$T^g$ projects could have been labeled as a subspace of $R^{g'}$ for some $g' \neq g$. Note that the first type
of error is unique to the quantum setting, and did not arise in the setting of Babai et al.'s
"self-improvement" lemma [BFL91]. Indeed, while quantum measurements are subject to arbitrarily
small perturbations that may add up over time, nothing short of flipping the output of a binary
function will suffice to corrupt it.

We devise a procedure which recovers from the first type of perturbation, but not the second. 
This appears unavoidable: if some components of the measurement $\{R^g\}$ are mislabeled (say by
completely re-shuffling the part of each measurement element that falls in a small-dimensional
subspace of the whole space), there is no generic way to recover the corresponding ideal measurement
elements. This is the main reason why the measurements we construct "shrink" at every
step of the induction, and we have to work with sub-measurements instead: any "mislabeled"
portions of space will have to be ignored. Since we cannot recover from such errors, it is crucial
that they do not add up to too much throughout the whole induction process.

19 At this point we are being vague as to how the approximation is measured; it will eventually be expressed solely
in terms of the consistency between the two measurements.

To correct the first type of error, we introduce the following procedure: 

1. For every $x$, find the measurement $\{S_x^g\}_g$ which is closest to $\{R^g\}$ while being perfectly
consistent with $\{A_x^a\}$: that is, $\sum_{g :\, g(x) = a} S_x^g = A_x^a$. This is possible only because the elements $S_x^g$
are allowed to depend on $x$. We define the $\{S_x^g\}$ as the optimum solution to a specific convex
program (see (13) below). Intuitively, $S_x^g$ is obtained as the "projection" of $R^g$ on the subspace
onto which $A_x^{g(x)}$ projects.

2. Show that $\{S_x^g\}_g$ in fact only depends on $x$ up to some error depending on $\varepsilon$ only (and not
$\delta$), so that defining $T^g := \mathbb{E}_x S_x^g$ leads to the consistent measurement we are looking for.

The second step is crucial: why would the $\{S_x^g\}$ be (almost) independent of $x$? Here the linearity
relations satisfied by the $\{A_x^a\}$ come into play. Using the perfect consistency of $S$ and $A$, together
with the linearity of $A$, we are able to conclude that the $\{S_x^g\}$ should not vary too much along
any axis-parallel line. That is, $S^g_{(x_1, x_2)} \approx_\varepsilon S^g_{(x_1, x'_2)}$ for all $x_1$ and $x_2, x'_2$ (and similarly in the other
direction). This step depends on the specific optimization problem that was introduced in order
to define $\{S_x^g\}_g$ (see (13) below). This invariance along axis-parallel lines can then be combined
with the (reasonably) good expansion properties of the hypercube to conclude that the $\{S_x^g\}$ are in
fact globally invariant, leading to the "corrected" measurement $\{T^g\}$. (We note that the fact that
invariance along axis-parallel lines implies global invariance was already used in [BFL91].)

We proceed with the details. In the following section we introduce the optimization procedure
that is used to define the operators $\{\hat{S}_x^g\}$. In Section 5.1.2 we show that the $\{\hat{S}_x^g\}$ are close to
being independent of $x_{<k}$, leading to the definition of the family of sub-measurements $\{T^g_{x_{\geq k}}\}$. In
Section 5.1.3 we show that $T$ satisfies the conclusions of Lemma 17.



5.1.1 A convex optimization problem 

Let $\{R^g_{x_{\geq k}}\}$ be the family of sub-measurements promised in the assumptions of Lemma 17. Let
$\{\hat{S}_x^g\}_g$, where $x \in \mathbb{F}^n$ and $g \in \mathrm{ML}(\mathbb{F}^{k-1}, \mathbb{F})$, be an optimal solution to the following convex
optimization problem:

Convex program for self-improvement

\begin{align*}
\omega := \min \quad & \mathbb{E}_x \sum_g \Big\| \hat{S}_x^g - \sqrt{R^g_{x_{\geq k}}} \Big\|_\rho^2 \qquad (13) \\
\text{s.t.} \quad & \forall x, a, \quad \sum_{g :\, g(x_{<k}) = a} \hat{S}_x^g (\hat{S}_x^g)^\dagger \leq A_x^a,
\end{align*}

where $\sqrt{R^g_{x_{\geq k}}}$ is the positive square root of $R^g_{x_{\geq k}}$. Let $S_x^g := \hat{S}_x^g (\hat{S}_x^g)^\dagger$. 20 Our first claim shows that
the optimum of (13) is bounded as a function of the inconsistency of $R$ and $A$.

Claim 19. Suppose that the $\{R^g_{x_{\geq k}}\}$ satisfy the assumptions of Lemma 17. Then the optimum $\omega$ of (13)
is at most $\mathrm{INC}(A, R) + O(\sqrt{\varepsilon})$.

20 We will usually use a hat, as in $\hat{S}$, to denote matrices which we think of as factorizations of positive semidefinite
matrices, but are not necessarily positive themselves. In general, the relation between $\hat{X}$ and $X$ will always be that
$X = \hat{X} \hat{X}^\dagger$.

Proof. We construct a feasible solution achieving the claimed value. Let $\hat{S}_x^g := A_x^{g(x_{<k})} \sqrt{R^g_{x_{\geq k}}}$. Then
by definition $\{\hat{S}_x^g\}$ is a feasible solution to (13). To upper-bound its value, we first evaluate

E,E( Tr p(Sf^) -Tr p (*U) =E^Tr,((Af^-Id)R? > J 

= E^Tr p (A«( £ RlJ) 

a g-g(x)¥=" 

= E x £ Tr p (Rf >t s Af ) + O (iNC ( A) 1 ' 2 ) , 

g 

where the second equality uses that $\sum_a A_x^a = \mathrm{Id}$ for every $x$, and the last follows from an application
of Lemma 40. A similar calculation shows that

e, E Tr P (s* (s?) + ) = E * E Tr p ( R L ® A * x<k } ) + ( INC ( A ) 1/2 ) • 



To conclude, expand $\big\| \hat{S}_x^g - \sqrt{R^g_{x_{\geq k}}} \big\|_\rho^2$ and use

\[ \mathbb{E}_x \sum_g \mathrm{Tr}_\rho\Big( R^g_{x_{\geq k}} \otimes A_x^{g(x_{<k})} \Big) = \mathrm{Tr}_\rho(R) - \mathrm{INC}(A, R) \]

by definition, together with the bound $\mathrm{INC}(A) \leq \varepsilon$ from (6). □

5.1.2 Constructing a family of sub-measurements independent of $x_{<k}$

As a first step in showing that any optimal solution to (13) must be close to one that does not
depend on $x_{<k}$, we show that such an optimal solution must be close to another feasible solution
which is furthermore close to being invariant along the direction of any axis-parallel line in
direction $i < k$. Precisely, we have the following.

Claim 20. Assume p^ 1 < e. For every i < k there exists a feasible solution {zf } ? to (13), with objective 
value at most co + 0(e 1/4 ), such that 

E,EI|2f-E^ ||^ = 0(^). 

c? 

Proof. Let {Sf } be an optimal solution to (13), and for any i < k let 

yg ._ B m*)( E §g\ 

where £, (x) is the line going through x and parallel to the i-th axis, and [B x } e is the "lines" family 
of measurements introduced in Claim 15. We first claim that the Yf^, while not strictly feasible, 
achieve an objective value in (13) of at most co + 0(e 1/4 ). 

Towards proving this, we first show that Bf^-^Sf is close to Sf . Recall the definition of B x = 
Ee-j( Xi )=a B Lr Using the fact that, since {Sf } is feasible, A g x {x - k) S s x = Sf, we get 

E.EII^SI " Slf = ((B*^ - A? ( ^)Sl (flf<**> - A^)) 

g g 

<E X ^\\B X -A x \\ 2 p 

a 

= 0{y/e) (14) 
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by Claim 15. Using the triangle inequality and convexity the following (not necessarily feasible) 
operators 

yg p ug(x<k)fig 



also achieve a value to + 0(y/e) in (13). 

Next we show that the Yf^ are close to the Yf_ ii := B^j^E^Sf . From the definition, 

Yl, = Bf5 w (E x , Sf ) + E,, £ B^Sf. 
The norm of the second term can be expanded as follows: 

2 
P 



**(*i)=*(*<0 

g ^Hg^K^Hgt*^) 

Eq. (29) from Lemma 40 shows that the contribution of all terms such that £ ^ £' is at most 
0(a/inc(B)) = 0(e 1/4 ) by Claim 15. But the only possibility for £ = £' is that also x, = y,-, since 
two distinct linear functions on F intersect in at most one point. Hence we have that 



E^-Elk E = E ^E^ E Tr p (Bl ; SlBl ; ) +0(^) 

g **(*i)=g(*< t ) P g P **(*,■)=*(*<*) 



< - + 0(e 1/4 ). 
Given our assumption on p, this implies 

e^,EII^-^Hp = o( £ 1/4 ), 

and hence the Y| ., while still not necessarily feasible, achieve an objective value in (13) of co + 

0(£ 1/4 ). 

Finally, define Zf := Af B^; w (E*. Sf) . Then the {Z| } are feasible in (13), and the fact that 

E.EPf-YlJl" = O(Vi) (15) 



follows from arguments similar to those used in the proof of Claim 19. Hence the {Zf } are a 
feasible solution to (13) with objective value at most co + O (e 1/4 ) • Finally, by convexity (15) implies 
that 

E x ^\\E x 2 g x -YiJ 2 = 0( v / i), 

g 

which together with the triangle inequality and (15) shows that the $\{\hat{Z}_x^g\}$ are close to their expectation
on any axis-parallel line in the i-th direction, proving the claim. □
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Using convexity of X — > |jX — A|| 2 for fixed A, the following follows from Claims 19 and 20. 
Claim 21. Let {Sf } be an optimal solution to (13). TTzen 

E v <,EHSf-E^^II? = °( £l/4 )- 

Proof. We show that the two solutions constructed to (13), { Sf } and { Zf } from Claim 20, must be 
close: 21 

E^EI|Zf-Slf = O^ 4 ). (16) 

g 

The claim then follows by using the triangle inequality to combine this bound with the fact, proved 
in Claim 20, that the Zf themselves are close to their expectation along any axis-parallel line in the 
z'-th direction. Hence it suffices to prove (16). Since the feasible set of (13) is convex, for any 
< t < 1 the elements {(1 — f)Sf + fZf } also constitute a feasible solution. By optimality of {Sf }, 
the resulting objective value must be at least co: for every < t < 1, 



c? P g 

= f 2 E*E 



■Zv Si 



fE,E 

g 

+ 2 1 E, E Tr P ( (Zl " Si) (Sf - + ) • 

Using the known objective values, re-arranging and making t —> 0, we obtain that 



E*ETr,((z£ - Sf) (y/$~- Sf) + ) = O^ 4 ). 



Hence 



e,E 

g 



-zf||J = E ,E(|zf-^L 



sf-^4 

2Tr,((Zf-Sl)(^-Sf) + )) 



= o( e 1 / 4 ), 



□ 



proving (16). 

Claim 21 shows that the $\{\hat{S}_x^g\}_g$ do not vary much along any axis-parallel line in the i-th direction.
Using the expansion properties of the hypercube, we can deduce that the $\{\hat{S}_x^g\}_g$ are close (in
the squared $\|\cdot\|_\rho$ norm) to a single operator, independent of the first $(k - 1)$ coordinates.

Claim 22. For every $x_{\geq k}$ and $g$, let $\hat{T}^g_{x_{\geq k}} := \mathbb{E}_{x_{<k}} \hat{S}_x^g$. Then

\[ \mathbb{E}_x \sum_g \Big\| \hat{S}_x^g - \hat{T}^g_{x_{\geq k}} \Big\|_\rho^2 = O\big( n \varepsilon^{1/4} \big). \]

Proof. This is a direct consequence of the expansion properties of the hypercube, as expressed in
Claim 38. □

21 Note that $\hat{Z}_x^g$ implicitly depends on $i$, and the following equation is measuring the distance on average over the
$k - 1$ different constructions of $\hat{Z}_x^g$ obtained for all $1 \leq i < k$.
5.1.3 Proof of Lemma 17

We conclude the proof of Lemma 17 by showing that the non-negative operators

\[ T^g_{x_{\geq k}} := \hat{T}^g_{x_{\geq k}} \big( \hat{T}^g_{x_{\geq k}} \big)^\dagger, \]

where for any $x_{\geq k}$ and $g$ the matrix $\hat{T}^g_{x_{\geq k}}$ is defined in Claim 22 in the previous section, satisfy the
conclusions of the lemma. First note that item 3 follows directly from Claim 22, so it will suffice to
verify that items 1 and 2 hold. Regarding item 1, we can bound

INC(T, A) = E x £ Tr p (T g x> _ k <g> A a x ) 

g,a^g(x <k ) 

= E X £ lx p {Sl®A x )+0{^ie m ) 

g,a^g(x <k ) 

< E x E Tr p {A a x ® A x ) + 0( v^£ 1/8 ) 
= 0(V^e 1/8 ), 

where the second equality follows from Cauchy-Schwarz and Claim 22, the inequality follows
from the fact that the $\hat{S}_x^g$ are a feasible solution to (13), and the last uses self-consistency of $A$ as
in (6).

Item 2 is proved in a similar way. Let $P$ be a family of sub-measurements of arity $\ell$, and assume
that $\ell < k$, the other case being treated symmetrically. By definition,

|C0N(P,T) - CON(P,R)| = E X £ Tr p( P 4 ® ( T L ~ R D) 

< ( E , E Tr p (P4 ® - - V^L) + ) ) 

f'S-S\x t ... k _ l -f 

• (e, e Tr p (4 ® (% + (% + V^) + ) ) 1/2 

f'S'-g\x(... k _ 1 -f 

1/2 



^^eii^-^iQ' 

= 0(lNC(R,A) 1/2 + v^£ 1/8 ), 



where the first inequality is by Cauchy-Schwarz, the second uses that $\sum_f P^f_{x_{\geq \ell}} \leq \mathrm{Id}$ for every $x_{\geq \ell}$,
and the last follows from the bounds proved in Claim 19 and Claim 22.

5.2 The pasting lemma 

In this section we prove Lemma 18. Let T be the family of sub-measurements whose existence 
is promised in the lemma's assumptions. For every x, let {S x } h and {T x>k } h be as in item 3 of 
Lemma 17. Let 8 be such that 



maxjlNC^B^lNC^A^E^Elsl- } < s > ( 17 ) 
where here $\{B^{\ell}_{x_{\neg k}}\}_{\ell}$ are the "lines" measurements in the k-th direction, as defined in Claim 15. Note
that Claim 15 implies that $\mathrm{INC}(T,B) \leq \mathrm{INC}(T,A) + O(\varepsilon^{1/4})$, which justifies including INC(T,B)
in (17).

Our goal is to define a new family of sub-measurements V, depending on one less coordinate of
x than T, but such that V is still consistent with A, and moreover V is not "too small", as measured
by items 2 and 3 in the lemma. The main idea is to define $\{V_{x_{>k}}\}$ as (roughly) corresponding to
the sequential application of $T$ twice, for two random choices of $x_k$. This will produce two
$(k-1)$-multilinear functions $h$ and $h'$, from which a $k$-multilinear function $g$ can be recovered
by interpolation. This is essentially the same method as was used to define the "line" operators
B from the "point" operators A in Claim 15. Here the main additional difficulty is that we are
starting with a family of sub-measurements, instead of complete, projective measurements as was
the case in Claim 15.
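The interpolation step mentioned above can be illustrated classically. The following is a minimal sketch, not the operator construction used in the proof: it assumes a small prime field F_7 and a hypothetical ground-truth function `g0`, and shows how two (k − 1)-multilinear restrictions, taken at two distinct values of the last coordinate, determine a k-multilinear function. The names `paste`, `g0`, `h` and `h_prime` are illustrative only.

```python
from itertools import product

P = 7  # illustrative prime; the field F of the paper is not fixed here


def paste(h_a, h_b, a, b):
    """Interpolate along the last coordinate.

    Returns g with g(x_<k, a) = h_a(x_<k) and g(x_<k, b) = h_b(x_<k),
    affine in the last coordinate x_k.
    """
    if (a - b) % P == 0:
        raise ValueError("the two values of x_k must be distinct")
    inv = pow((a - b) % P, -1, P)  # inverse of (a - b) modulo P

    def g(*x):
        *rest, xk = x
        w_a = (xk - b) * inv % P  # Lagrange weight, equals 1 at xk = a
        w_b = (a - xk) * inv % P  # Lagrange weight, equals 1 at xk = b
        return (w_a * h_a(*rest) + w_b * h_b(*rest)) % P

    return g


def g0(x1, x2, x3):
    # hypothetical 3-multilinear function over F_7, used only as ground truth
    return (3 + 2 * x1 + 5 * x1 * x2 + 4 * x3 + 6 * x1 * x2 * x3) % P


h = lambda x1, x2: g0(x1, x2, 2)        # restriction to x_3 = 2
h_prime = lambda x1, x2: g0(x1, x2, 5)  # restriction to x_3 = 5
g = paste(h, h_prime, 2, 5)
recovered = all(g(*x) == g0(*x) for x in product(range(P), repeat=3))
```

Because `g0` is affine in its last coordinate, agreeing with it at two distinct values of that coordinate forces agreement everywhere, which is what `recovered` checks.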

This section is organized as follows. We start with some preliminary observations in Sec- 
tion 5.2.1. The family of sub-measurements V is defined in Section 5.2.2. Item 1 in the conclusion 
of Lemma 18 is proved in Section 5.2.3, and items 2 and 3 are proved in Section 5.2.4. 



5.2.1 Pre-processing 

In this section we prove a preliminary claim, Claim 23 below, which lets us modify the family of 
sub-measurements T into another family Q that has useful properties. The important property is 
item 3. in the claim, which establishes a form of commutation between Q and the "line" measure- 
ments B. Intuitively, that such a property would hold for Q equal to T should follow from the 
consistency between the families of sub-measurements defined by T and B: consistent measure- 
ments are "compatible", and by the gentle measurement lemma (cf. Lemma 35) the order in which 
they are performed does not matter. However, we could not show directly that item 3 below holds 
for the family of sub-measurements T itself; hence we need to modify it slightly. 

Claim 23. Let T be the family of sub-measurements satisfying the assumptions of Lemma 18, and $\delta$ be as
in (17). There exists a family of sub-measurements $\{Q_{x_{>k}}\}$ such that the following hold:

1. Tr p (Q) > Tr p (T) - 0(5 Ci ), 

2. For every x>i an d h, Q x>k = B x>k Q x>k B x>k for some family of sub-measurements {Q x>k } ( an d in 
particular inc(Q, A) = 0(e 1/2 )), 

3. Let Q x>k = E Xk Zh Q h x > k ■ For any r > 1, 



where $c_4 > 0$ is a universal constant.

Proof. For any $x_{<k}$ define a "pinching" map



? ■ T h ,_s. JJ h ( x <k)-r h v h (x<k) 
C X <k ■ L X >k ^ ° X 1 X> k °X 



Note that $\mathcal{E}_{x_{<k}}$ also implicitly depends on $h$ and $x_{>k}$, but this dependence will always be clear from
the context. Let $\mathcal{E}(\cdot) := \mathrm{E}_{x_{<k}} \mathcal{E}_{x_{<k}}(\cdot)$. The idea for the definition of Q consists in applying the
map $\mathcal{E}$ to T a certain number of times, leveraging a certain stability property that will follow after
sufficiently many applications.
Let M be an integer to be fixed later, and for every $x_{>k}$ and $h$ let $R^h_{x_{>k}} := \mathcal{E}^M(T^h_{x_{>k}})$, where $\mathcal{E}^M$
denotes the sequential composition of $\mathcal{E}$ with itself M times. Using the Schwartz-Zippel lemma
(Lemma 33) it is not hard to verify that, as long as $M \geq 1$, $\mathrm{INC}(R,B) = O(\mathrm{INC}(B,B) + n/p) =
O(\varepsilon^{1/2})$. The proof of Claim 23 is based on the following sequence of facts.

Fact 24. There is a choice of $M \leq \delta^{-1/4}$ for which the following holds:

\[ \mathrm{E}_x \sum_h \mathrm{Tr}_\rho\Big( \big(R^h_{x_{>k}} - \mathcal{E}(R^h_{x_{>k}})\big)^2 \Big) = O(\delta^{1/4}). \]

Proof. Let $I_1 = \delta^{-1/4}$. The proof is based on the use of the potential function

\[ \Phi_i := \mathrm{E}_x \sum_h \mathrm{Tr}_\rho\Big( \mathcal{E}^{I_1 - i}\big( (\mathcal{E}^{i}(T^h_{x_{>k}}))^2 \big) \Big), \]

defined for all $0 \leq i \leq I_1$. Note that $\Phi_i$ is non-negative, always at most 1, and by the pinching
inequality $(\mathcal{E}(X))^2 \leq \mathcal{E}(X^2)$ for any positive semidefinite X, $\Phi_i$ is non-increasing with i. Let $i_1$ be the
smallest index i for which it holds that

ExETrp^-'^C^- 1 ^)) 2 ) - (£ x< ^ l -HTlj)) 2 )) < ^\ (18) 

h 

Using operator convexity of the square function, this inequality not being satisfied for some i
implies that $\Phi_{i-1} - \Phi_i > \delta^{1/4}$. Since this can happen for at most $\delta^{-1/4}$ indices i, an $0 < i_1 \leq \delta^{-1/4}$
such that (18) is satisfied for $i = i_1$ must exist. Using self-consistency of B $(I_1 - i_1)$ times, and
consistency of T and B, (18) is seen to imply

ExEtMC^-'CO - e x<k {e^-\T h x>k ))f) < 0(S-^inc(B)^ + 5^). 
h 

To conclude, we set M := i x ~ 1 and use INC(B) = 0(e 1/2 ) < 5 in . □ 
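The counting step behind the potential-function argument can be isolated as a statement about plain numbers. The sketch below is only an illustration under that simplification: `phi` stands for a hypothetical non-increasing sequence of values in [0, 1] (in the proof these are traces of operator expressions), and `first_small_drop` is an illustrative name. If every one of I consecutive drops exceeded a threshold t with I·t ≥ 1, the total drop would exceed 1, which is impossible.

```python
from fractions import Fraction


def first_small_drop(phi, threshold):
    """Return the smallest i >= 1 with phi[i-1] - phi[i] <= threshold, or None.

    phi is assumed to be a non-increasing sequence with values in [0, 1].
    """
    for i in range(1, len(phi)):
        if phi[i - 1] - phi[i] <= threshold:
            return i
    return None


# Hypothetical potential values; threshold 1/4 plays the role of delta^(1/4),
# and there are I = 4 steps, so I * threshold = 1.
phi = [Fraction(1), Fraction(7, 10), Fraction(4, 10), Fraction(1, 10), Fraction(1, 20)]
index = first_small_drop(phi, Fraction(1, 4))
```

Here the first three drops equal 3/10 > 1/4, so the first small drop occurs at the fourth step.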

The following is a consequence of Fact 24. 
Fact 25. The following holds 

E*W W E Tr p (Bl t <^<^<^BlJ =0(lNC(R,B) 1 / 2 + ^/ 8 ). 

Proof. By definition of R, 

F / Y"Tr (R g]Xk R 8lVk R g]Xk ) 

^X^Xk^Vk L-l ll P \ 1X X k X >k 1X y k X >k 1 ^X k X >k ) 
g 

p V^Tr T>g(x <k yk) -n$\Vk T>g{x<kyk) T}S\x k \ 

— n x^ k ,x k ^y k lL p\ lK XkX >k D x^ k yk ix ykX >k D x^ k yk 1K x k x >k ) 

g 

-V , VTr ( ( nS(x<k) nS\x k K g{x<k)\r,g(x <k yk)-nS\yk R S(*<«ft) K 8\*k \ , n/M 

— a x^ k ,xk^=yk 2— i xl p\\ D x^kXk lx x kX> k D x^kXk ) D x^kVk lx ykX>k D x^kyk lx x k x >k ) ^~ )> 

g 
where the second equality follows from Fact 24. Using that, by definition, for x k ^ y k , B^ x k xf^ B^yf^

(Tl 

B x *< k and consistency of R and B, we get 

F j. Y"Tr (R g]Xk R glVk R 8lXk ) 

g 

= ^ k , XkHk Y^ P i^Xk^ k ^k\k^ k ^Xk) + 0(INC(K, B) 1/2 + 

g 

= E, W y t T^P^T^kXk^kXk^kXk^T) + 0(INC(R, B) 1/2 + ^ 8 ), 

g 

where the last equality again follows (after a little work) from Fact 24. □ 
We will also use the following. 

Fact 26. Let $\{S^g_{x_{>k}}\}_g$ be an arbitrary family of sub-measurements and $\mu_2 > 0$. There exists an $i_2 \leq \mu_2^{-1}$
such that

E x ^TrJ(£ x>k (£^-\slj)-£^\Sij) 2 ) = 0(u 2 + u 2 1 INC(B) 1/2 ), 

g 

where here we denote £(S x>k ) = E x<k Bx^ k <k S x>k Bx^ k <k ■ Moreover, for all i > z'2 it holds that 

E,E Tr p((^' +1 (sL)-^(sL)) 2 ) = o{ n + fiNc(B) 1 / 2 ). 

g 

Proof. The proof is very similar to that of Fact 24, and is based on the use of the potential function 

*i := E x £Tr p (£^((£\S x>k )) 2 )), 

c? 

defined for all $0 \leq i \leq I_2$, where $I_2 = \mu_2^{-1}$. Note that $\Phi_i$ is always at most 1, and by the pinching
inequality $\mathcal{E}(X)^2 \leq \mathcal{E}(X^2)$ for any positive semidefinite X, $\Phi_i$ is non-increasing with i. Let $i_2$ be the
smallest index such that $\Phi_{i_2 - 1} - \Phi_{i_2} \leq \mu_2$; as long as $I_2 \geq \mu_2^{-1}$ such an $0 < i_2 \leq I_2$ must exist. By
definition, it then holds that

E,ETr P (^- l2 (^((^-HsI > J)V(^(^- 1 (Sf > ,))) 2 )) < n- 

S 

Using self-consistency of B $(I_2 - i_2)$ times, we obtain

^Tr p [[£x <k {^- l {Sl >k )) - £ h -\siJ) 2 ) < n + 0{famc{Bf' 2 ). 

g 

To conclude the proof, it suffices to use the operator convexity of the square function to move the 
expectation over x <k inside the square, and then observe that 

E x j:Jr p ((£((£^-\slj)-£ 2 ((£^\Sl >k ))) 2 ) 

g 

<E x ETr,(£:(((^- 1 (SL))-^((^- 1 (Sf > J)) 2 ) 

g 

< E*E^((( fi2_1 ( S D -m^iSU)) 2 ) +0(INC(B) 1 / 2 ), 

g 

again using self-consistency of B. □ 
Let M' be an integer to be fixed later, and for every $x_{>k}$ and $h$ define

Q x > k ■- \E x<k B x ) {K x ^ k ){b x<k ti x ) . 

Observe that, as before, as long as M' > 1 it holds that inc(Q,B) = 0(inc(B,B) + n/p) = 
0(e 1/2 ). For any g £ ML(F fc ,F), let Q 8 x>k := E^ojJ^ Q 8 yk % k . The following implies item 3. in 
Claim 23: for any r > 1, 

E x TrJ((Q x>k Y - £ {B^Qiy x ^) r ) 2 ) = 0(r 2 ^) f (19) 

g 

where $c_3 > 0$ is a universal constant. Eq. (19) is proved by induction on r. The case r = 1 is stated
in the following claim.

Fact 27. The following holds 

E,Tr,((Q^-X:B^Qf > ,B^) 2 ) = 0(S 1/16 + M'S 1/8 ). (20) 

Proof. Fact 25 implies that {Q x>k } and B are 0(S 1/8 ) -consistent, from which it follows that 

E x J^Jr p ( B x *< k Q x>k B x * k <k B x ^ k <k Q g x>k B x * k <k ) = E x E^Qi^ 4^QL) + 0{5 1 ' 8 ) 
g>g' g>g' 

= E x ETr p (Qi,QC $5 ® flfc* ) + 0(^ 1/16 + M^ 1/8 ) 

g,g' 

= E x Tr p {(Q x> f)+0(S 1/16 + M'S 1/8 ). 

Here the second equality follows by applying Fact 26 with $\mu_2 := \delta^{1/16}$ and $S_x$ chosen as $Q_{x_{>k}}$ to
move the $B$ operators to the outside, and holds as long as $M' \geq \mu_2^{-1} = \delta^{-1/16}$. (The third uses
consistency of Q and B.) Expanding out the square in (20), all four terms can be related up to
$O(M'\delta^{1/8})$ by using similar arguments. □

The induction step required to prove Eq. (19) uses arguments similar to those in the proof of
Fact 27, and we leave the details to the reader. Once that equation is established, choosing $M' =
\delta^{-1/16}$, item 3 in Claim 23 follows. Items 1 and 2 in the claim are simple consequences of the
definition of Q from R, and of R from T; again we omit the details. □



5.2.2 Construction of the pasted family of sub-measurements 

In this section and for the remainder of the proof of Lemma 18 we rename the family of
sub-measurements $\{Q_{x_{>k}}\}$ constructed in the previous section into $\{T_x\}$. The only properties of
that family that we will need are those stated in Claim 23. In order to define the pasted
sub-measurements V, we first introduce a "pseudo-inverse" $\hat{T}$ as follows. As usual, let $T_{x_{>k}} = \mathrm{E}_{x_k} T_x$,
and let $\eta > 0$ be a small parameter to be fixed later. Define

(E( Id -^) ) . (2D 
where $R := (10/\eta)\log(1/\eta)$ is chosen so that $T_{x_{>k}}(1 - T_{x_{>k}}\hat{T}_{x_{>k}}) \leq \eta\,\mathrm{Id}$ (note that, by definition,
$\hat{T}_{x_{>k}}$ commutes with $T_{x_{>k}}$). Expanding out the series in the definition of $\hat{T}_{x_{>k}}$, item 3 from Claim 23
implies that the following equation holds:

E,Tr p ((f^ -E B L k f ^ B Lf) = 0{{5/ n r), (22) 

t 

where $c_5 > 0$ is a sufficiently small constant. For every $x_{>k}$ and $g \in \mathrm{ML}(\mathbb{F}^k, \mathbb{F})$, define

^*»c - — V p/ yk 1 x > ky Iz x k ^y k J -x k x >k J 1 x >k 1 y k x >k 1 x >k y n -x k ^y k 1 x k x >k J 1 x >k - 

The scaling factor $(1 + R/p)^{-1}$ is necessary to ensure that the $\{V_{x_{>k}}\}$ sum to at most identity. It
induces an extra error term in all our estimates; however our choice of $\eta = \delta^{c'}$ for some $c' > 0$
will ensure that this error term is of the same order as ones that already appear; for clarity in the
remainder of this section we will neglect it.

Claim 28. The $\{V^g_{x_{>k}}\}$ form a family of sub-measurements of arity k + 1.

Proof. It is clear that $V^g_{x_{>k}} \geq 0$ for every g. When the variable g runs over $\mathrm{ML}(\mathbb{F}^k, \mathbb{F})$, for $x_k \neq y_k \in
\mathbb{F}$ the restrictions $g|_{x_k}$ and $g|_{y_k}$ independently run over $\mathrm{ML}(\mathbb{F}^{k-1}, \mathbb{F})$. Hence, using convexity of the
map $A \mapsto A X A^\dagger$ for any A and $X \geq 0$,

j2 V ^>k-( 1 + ~) 73 I] ll^ X >k T X k X >k T X >k Ty kX>k f X>k T^ kX>k f x>k 

g V V y k ^x k h,h> 

= i 1 + 7) ~v) Y^Y^ Tx >k T x k x >k T x >k T x>k T x>k T^ kX>k f x>k 
y V r x k h 

1 + ~Z ) ~Zp. T x >k T XkX>k f x>k Tx k x >k % >k T XkX>k T x>k 

" " x k h 



£(1 +f)> + %) £M , 



where to obtain the last line we used $(T_{x_k x_{>k}})^2 \leq T_{x_k x_{>k}}$ as well as $\hat{T}_{x_{>k}} \leq R^{1/2}\,\mathrm{Id}$ and
$\hat{T}_{x_{>k}} T_{x_{>k}} \hat{T}_{x_{>k}} \leq \mathrm{Id}$. □

5.2.3 Consistency 

In this section we show that the "pasted" sub-measurement V is consistent with A, proving item 1 
of Lemma 18. It will be convenient to introduce the shorthand 

y£» ■■= f x>j h x> jx >k - (23) 

We also let $\delta_W := \max(\delta, \mathrm{INC}(W, A))$.
Claim 29. The following holds 

INC(W,A) = E x £ Tx p {t x>k T h x J x>k ® A x ) = 0{(5/t]ys), 

h,a^h(x <k ) 

where $c_5 > 0$ is the constant that appears in (22).
Proof. We have

INC(W,A) = E x>k E Tr p ( f x>k T* >k T x>k (Id-<J) 

h 

h,a 

= E^Tr p (B^^<J,>,<^(Id-<J)+0((^)^) 
h 

= E x £Tr p (T x>k R h x>k T x>k (Id -A h x J ® <J + 0((*/j/)*) 
= 0((<5/ f /P), 

where the second equality follows from item 2 in Claim 23 (and some sub-measurement $\{R_{x_{>k}}\}$),
the third follows from (22), the fourth uses Lemma 40 together with consistency of B and A as in
Claim 15, and the last again follows from self-consistency of A, together with $\hat{T}_{x_{>k}} \leq R^{1/2}\,\mathrm{Id} \leq
\eta^{-1}\,\mathrm{Id}$ for small enough $\eta$. □

Claim 30. The family of sub-measurements V is consistent with A:

\[ \mathrm{INC}(V, A) = O(\delta_W^{1/2}). \]

Proof. By definition, 

INC(V, A) = E x ^ yk E Tr p (W 8 ^l/ yk % k W 8 J:l k ® A«) 

g,«^g(*<;t) 

= E x x * x „+ v , E Tr p (W?* lf! y x k wV A 8(x< f D ®A X ® A s(x< ^ ] ) + 0{5 1 ' 2 ) 

X ' X k' X k ^Vk I— I V \ x' k X >k VkX>k x'£x >k X^ k x' k ' ^ X ^ X^ k x[ ) V W > 

g,a^g(x<k) 

where the second and third equalities each follow from an application of Lemma 39 and the
definition of $\delta_W$, and the last follows by applying Cauchy-Schwarz and using linearity of A in the k-th
direction, as in (7). □

5.2.4 Consistency with arbitrary sub-measurements 

We now show that items 2 and 3 in the conclusion of Lemma 18 hold. We will make use of the
bound

\[ \mathrm{E}_{x_{>k}} \mathrm{Tr}_\rho\big( (\mathrm{Id} - W_{x_{>k}})\, T_{x_{>k}}\, (\mathrm{Id} - W_{x_{>k}}) \big) = O(\eta), \qquad (24) \]

which holds by definition of $W$ (cf. (23)) and of $\hat{T}_{x_{>k}}$ (cf. (21)).
Claim 31. For any family of sub-measurements P of arity at least k + 1,

\[ |\mathrm{CON}(P,V) - \mathrm{CON}(P,T)| = O\big(\delta_W^{1/2} + \mathrm{INC}(P,A)^{1/2} + \eta^{1/2}\big). \]
Proof. Let P be an arbitrary family of sub-measurements of arity $\ell \geq k + 1$. We prove the claim in
case $\ell = k + 1$, the other cases being exactly similar. Then $P = \{P^g_{x_{>k+1}}\}_{g \in \mathrm{ML}(\mathbb{F}^{k+1}, \mathbb{F})}$, and by definition

C0N(P, V) = E x £Tr p (Pf ® V$) 

g 

where the last equality follows from Lemma 39 and the definition of 5 W . We can then write 

= e^^ w E K P (A Hx r' k) pLX {x< i x ' k) ®K x * t w¥' ®A g(j:< f 4) )+o(4 /2 ) 

^3ft ^ P \ *-.it^ >k x^ k x[ ^ x k x>k Vk x >k x'{x >k ^ x^ k x[ I \ W ) 

= E M?L ® ^W*>^<4 ® + °( £l/2 + ^ 2 ), (26) 

&(*<*)=£(*<***) 

where the first equality again uses Lemma 39 and the definition of $\delta_W$, and the last follows from
an application of the Cauchy-Schwarz inequality and self-consistency of A. In the last expression,
$h$ and $g|_{x'_k}$ are two distinct $(k-1)$-linear functions over $\mathbb{F}$: by the Schwartz-Zippel lemma
(see Lemma 33 for a statement) they intersect in a fraction at most $O(k/|\mathbb{F}|) = O(k/p)$ of points.
Hence, applying the Cauchy-Schwarz inequality to recover a non-negative expression, we can
upper bound (26) by $O(\sqrt{k/p} + \varepsilon^{1/2} + \delta_W^{1/2}) = O(\delta_W^{1/2})$ since $\delta_W \geq \delta \geq n p^{-1}$. Together with (25),
this shows that

CON(P, V) = E x , x ^ yi X>, ( P | >t ® W^^W^ is <*<^) + 0(4 /2 ) 

s 

= ^nLMPL ® ® } ) + o(4 /2 + >/ 1/2 ), 

where the second equality follows from the Cauchy-Schwarz inequality and (24). Repeating the
same steps for the remaining $W$ term, and using consistency of P and A to conclude, proves
the claim. □

Claim 32. For any sub-measurement P, of arbitrary arity,

\[ |\mathrm{CON}(P,V) - \mathrm{CON}(P,T)| = O\big( |\mathrm{CON}(P,T) - \mathrm{Tr}_\rho(T)|^{1/2} + \delta_W^{1/2} + \eta^{1/2} \big). \]

Proof. The proof closely follows that of Claim 31, and we omit the details. □

This concludes the proof of Lemma 18 provided $c_2$ is chosen to be a sufficiently small constant.
A Auxiliary lemmas



We first recall a key lemma in the analysis of low-degree polynomials over a finite field, the
Schwartz-Zippel lemma [Sch80, Zip79], which we state in a form that will be useful to us.

Lemma 33 (Schwartz-Zippel). Let $\mathbb{F}$ be a finite field, n an integer, and $f : \mathbb{F}^n \to \mathbb{F}$ a non-zero multilinear
function. Then f has at most $n|\mathbb{F}|^{n-1}$ zeros.
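For small parameters the bound of Lemma 33 can be checked by exhaustive enumeration. The following sketch is only a sanity check under illustrative assumptions: the field is a prime field F_p with arithmetic modulo p, a multilinear polynomial is stored as a dictionary from exponent vectors in {0,1}^n to coefficients, and the helper names are hypothetical.

```python
from itertools import product
import random


def evaluate(coeffs, x, p):
    """Evaluate a multilinear polynomial {exponent tuple in {0,1}^n: coeff} at x over F_p."""
    total = 0
    for mono, c in coeffs.items():
        term = c
        for xi, e in zip(x, mono):
            if e:
                term = term * xi % p
        total = (total + term) % p
    return total


def count_zeros(coeffs, n, p):
    """Count the zeros of the polynomial over all of F_p^n."""
    return sum(1 for x in product(range(p), repeat=n) if evaluate(coeffs, x, p) == 0)


def check_random(n, p, trials, seed=0):
    """Check zeros <= n * p^(n-1) for random non-zero multilinear polynomials."""
    rng = random.Random(seed)
    monos = list(product((0, 1), repeat=n))
    for _ in range(trials):
        coeffs = {m: rng.randrange(p) for m in monos}
        if all(c == 0 for c in coeffs.values()):
            continue  # the lemma only concerns non-zero f
        if count_zeros(coeffs, n, p) > n * p ** (n - 1):
            return False
    return True


zeros_of_x1x2 = count_zeros({(1, 1): 1}, 2, 5)  # f(x1, x2) = x1 * x2 over F_5
```

For f(x1, x2) = x1 x2 over F_5 the zeros are the points with x1 = 0 or x2 = 0, which is 9 points, against a bound of n|F|^{n-1} = 10.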

The next series of claims are all based on variants of the Cauchy-Schwarz inequality. The first
follows from Eq. (3) of Bhatia and Davis [BD95] (see also [Bha88]), substituting the norm $|||\cdot|||$
by $\|\cdot\|_1$.

Theorem 34. Let A and B be arbitrary matrices such that the product $A^\dagger B$ is well-defined. Then,

\[ \|A^\dagger B\|_1 \leq \|A\|_F \, \|B\|_F. \]

Winter's gentle measurement lemma [Win99, Lemma 9] (see also Aaronson's "almost as good 
as new" lemma [Aar05, Lemma 2.2]) is a key lemma formalizing the intuitive fact that if a mea- 
surement produces a certain outcome with near-certainty when performed on a specific state, then 
the post-measurement state is close to the original state. The following is a variant of that lemma, 
and we give a proof following Ogawa and Nagaoka [ON07, Appendix C]. 

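Lemma 35. Let $\rho$ be a density operator on a Hilbert space $\mathcal{H}$, and X and Y be linear operators from $\mathcal{H}$ to a
Hilbert space $\mathcal{K}$ such that $X^*X \leq I$ and $Y^*Y \leq I$. Then,

\[ \|X\rho X^* - Y\rho Y^*\|_1 \leq 2\sqrt{\mathrm{Tr}\,(X - Y)\rho(X - Y)^*}. \]

Proof. By the triangle inequality,

\[ \|X\rho X^* - Y\rho Y^*\|_1 \leq \|(X - Y)\rho X^*\|_1 + \|Y\rho (X - Y)^*\|_1. \]

By Theorem 34,

\[ \|(X - Y)\rho X^*\|_1 \leq \|(X - Y)\sqrt{\rho}\|_2 \, \|\sqrt{\rho} X^*\|_2
   = \sqrt{\mathrm{Tr}\,(X - Y)\rho(X - Y)^*} \, \sqrt{\mathrm{Tr}\, X\rho X^*}
   \leq \sqrt{\mathrm{Tr}\,(X - Y)\rho(X - Y)^*}. \]

Similarly, $\|Y\rho(X - Y)^*\|_1 \leq \sqrt{\mathrm{Tr}\,(X - Y)\rho(X - Y)^*}$, and the lemma follows. □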
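As a sanity check, the inequality of Lemma 35 can be tested numerically in the commuting special case, where everything reduces to arithmetic on probability vectors. The sketch below makes that simplifying assumption explicit: X and Y are real diagonal contractions with entries in [-1, 1] and $\rho$ is a diagonal density operator, so the trace norm is a weighted sum of absolute values. It does not exercise the non-commutative content of the lemma, and the function names are illustrative.

```python
import math
import random


def gentle_sides(x, y, p):
    """Return (lhs, rhs) of Lemma 35 for diagonal X = diag(x), Y = diag(y), rho = diag(p)."""
    lhs = sum(pi * abs(xi * xi - yi * yi) for xi, yi, pi in zip(x, y, p))
    rhs = 2 * math.sqrt(sum(pi * (xi - yi) ** 2 for xi, yi, pi in zip(x, y, p)))
    return lhs, rhs


def check(trials=1000, dim=4, seed=1):
    """Test the inequality on random diagonal instances."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        y = [rng.uniform(-1, 1) for _ in range(dim)]
        w = [rng.random() for _ in range(dim)]
        total = sum(w)
        p = [wi / total for wi in w]  # diagonal of a density operator
        lhs, rhs = gentle_sides(x, y, p)
        if lhs > rhs + 1e-12:
            return False
    return True


lhs, rhs = gentle_sides([1.0, 1.0], [1.0, 0.0], [0.5, 0.5])
```

In this commuting case the bound follows from $|x^2 - y^2| \leq 2|x - y|$ and Jensen's inequality, mirroring the two steps of the proof above.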

We state the following two corollaries of Lemma 35. 
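Claim 36. Let $\{A_i\}$ and $\{B_i\}$ be two sets of positive matrices of the same dimension, and $\rho \geq 0$. Then

\[ \Big\| \sum_i \sqrt{A_i}\,\rho\,\sqrt{A_i} - \sqrt{B_i}\,\rho\,\sqrt{B_i} \Big\|_1 \leq 2 \Big( \sum_i \mathrm{Tr}\big( (\sqrt{A_i} - \sqrt{B_i})^2 \rho \big) \Big)^{1/2}. \]

Proof. Let X be a block-column matrix with blocks the $\sqrt{A_i}$, and similarly for Y and the $\sqrt{B_i}$. Then

\[ \Big\| \sum_i \sqrt{A_i}\,\rho\,\sqrt{A_i} - \sqrt{B_i}\,\rho\,\sqrt{B_i} \Big\|_1
   \leq \sum_i \big\| \sqrt{A_i}\,\rho\,\sqrt{A_i} - \sqrt{B_i}\,\rho\,\sqrt{B_i} \big\|_1
   \leq \| X\rho X^\dagger - Y\rho Y^\dagger \|_1 \]

and

\[ \mathrm{Tr}\big( (X - Y)\rho(X - Y)^\dagger \big) = \sum_i \mathrm{Tr}\big( (\sqrt{A_i} - \sqrt{B_i})^2 \rho \big), \]

so that the claim follows from Lemma 35. □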
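Claim 37. Let $\sigma \geq 0$ be a (possibly un-normalized) density matrix on 3 registers, and suppose that $\sigma$ is
invariant with respect to permutation of the first two registers. Let $\{A_i\}_i$ be a POVM on either of the first
two registers, and let

\[ \delta := \sum_{i \neq j} \mathrm{Tr}\big( (A_i \otimes A_j \otimes \mathrm{Id})\,\sigma \big). \]

Then

\[ \Big\| \sum_i (\sqrt{A_i} \otimes \mathrm{Id})\, \mathrm{Tr}_2(\sigma)\, (\sqrt{A_i} \otimes \mathrm{Id}) - \mathrm{Tr}_2(\sigma) \Big\|_1 = O(\sqrt{\delta}), \]

where here $\sqrt{A_i}$ acts on the first register of $\sigma$, and the identity on the third.
Proof. First note that, $\{A_i\}_i$ being a POVM,

\[ \mathrm{Tr}_2\Big( \sum_i (\mathrm{Id} \otimes \sqrt{A_i} \otimes \mathrm{Id})\, \sigma\, (\mathrm{Id} \otimes \sqrt{A_i} \otimes \mathrm{Id}) \Big) = \mathrm{Tr}_2(\sigma). \]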



Hence by monotonicity of the trace norm 

|| £ y/Ai <g> Id(Tr 2 ((r)) ® Id -Tr 2 ((r) || 1 

i 

< || ® v/^/ ® Idcry^ (2) </a^ ® Id - J^Id <g> ^/a~ ® Idrfd i 



Aj <8> Id || j 



< || J^V^i® \f~A~i® Ido-^/Xi <g) y^i ® Id - ^ Id® a/A^ <g> Idcrldtg) y^^Id || x 

i i 

+ Y^Tr (Ai ® Aj® Id cr) 

¥i 



< 2^/£Tr(( ^[Ai <g> y[~Ai ® Id -Id ® <g> Id)V) + £ 

< 2v^ + <S 



where the second inequality is the triangle inequality, the third is by Claim 36, and for the last we
expanded

Y_ Tr ( ( a/A~ <g> a/A~ ® Id -Id ® <8> Id) V) 

= ^ (Tr tg> A, <8> Id o-) + Tr ( Id ® A; ® Id o) - 2Tr ( y 7 ^ ® A; ® Id tr) ) 

j 

< ^ (Tr ( A ; - ® A; ® Id cr) + Tr ( Id ®A; ® Id cr) - 2Tr(A,- ® A, ® Id tr) ) 

i 

where for the inequality, $\sqrt{A_i} \geq A_i$ follows from $0 \leq A_i \leq \mathrm{Id}$ for every i, and the last equality uses
the definition of $\delta$ and $\sum_i A_i = \mathrm{Id}$. □

The following lemma follows from the standard expansion properties of the hypercube. Recall
that for $\rho \geq 0$ and any A, $\|A\|_\rho^2 = \mathrm{Tr}(A A^\dagger \rho)$.

Claim 38 (Expansion lemma). Let $\varepsilon > 0$, S a finite set of size $|S| = p$, n, d integers and $A : S^n \to \mathbb{C}^{d \times d}$
such that for every $x \in S^n$, $0 \leq A_x \leq \mathrm{Id}$, and
where the expectation is taken with respect to the uniform distribution on $[n] \times S^{n-1} \times S \times S$. Then



E T A, 



< 2tt£, 



where both expectations are taken under the uniform distribution over S n . 

Proof. Let M := Y x ,i,x' \ x )i x '\ the adjacency matrix of the hypercube S", L := npld — M the 
Laplacian, and L = L <g> p. Let A = \x) ® A*. Then 

A+I • A = \ E ( A * " ^')V(^ - ArO- (27) 

The normalized Laplacian $L/(np)$ has smallest eigenvalue 0, and second smallest $\lambda_1 \geq 1/(2n)$.
Let the smallest eigenvector of L be $|v_0\rangle = p^{-n/2} \sum_x |x\rangle$, and write $A = |v_0\rangle \otimes A_0 + |v_1\rangle \otimes A_1$,
where $|v_1\rangle$ is orthogonal to $|v_0\rangle$, and $A_0 = p^{-n/2} \sum_x A_x$. Then

A + LA = AiAjpAi > ^-A\pA x . 

Taking the trace and using the assumption made in the claim's statement together with (27), we 
get || Ai \\p < 2nep", and hence by definition of A, 

Tr((A- |w )® A ) + (Id®p)(A- |z7 )®A )) = < 2nep n , 

which proves the claim. □ 
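The spectral gap used in this proof can be verified directly in the smallest case S = {0, 1} (so p = 2). The sketch below rests on an assumed reading of the adjacency matrix: the entry M[x, x'] counts the coordinates i such that x and x' agree outside coordinate i, so that each x is adjacent to itself n times. Under that reading the characters $\chi_s(x) = (-1)^{s \cdot x}$ are eigenvectors of L = np Id − M with eigenvalue 2|s|, and the normalized eigenvalues are |s|/n. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def apply_laplacian(v, n):
    """Apply L = n*p*Id - M (with p = 2) to a vector v indexed by {0,1}^n."""
    p = 2
    out = {}
    for x in product((0, 1), repeat=n):
        m = 0
        for i in range(n):
            for b in (0, 1):
                # x' agrees with x outside coordinate i and takes value b at i
                m += v[x[:i] + (b,) + x[i + 1:]]
        out[x] = n * p * v[x] - m
    return out


def character(s, n):
    """chi_s(x) = (-1)^(s.x), a candidate eigenvector of L."""
    return {x: (-1) ** sum(a * b for a, b in zip(s, x))
            for x in product((0, 1), repeat=n)}


def normalized_eigenvalue(s, n):
    """Check that L chi_s = 2|s| chi_s and return the eigenvalue of L/(n*p)."""
    chi = character(s, n)
    image = apply_laplacian(chi, n)
    lam = 2 * sum(s)
    if any(image[x] != lam * chi[x] for x in chi):
        raise AssertionError("chi_s is not an eigenvector with the predicted eigenvalue")
    return Fraction(lam, 2 * n)


n = 3
spectrum = sorted({normalized_eigenvalue(s, n) for s in product((0, 1), repeat=n)})
```

For n = 3 the second smallest normalized eigenvalue found this way is 1/n, which is at least the 1/(2n) used in the proof.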



B Lemmas about consistency 

The following useful lemma relates the consistency of a measurement when performed on two 
separate subsystems of a permutation-invariant state with the possibility of exchanging the sub- 
system on which the measurement is performed. Here p is the reduced density of a permutation- 
invariant state. 

Lemma 39. Let $k \geq \ell \geq 1$ be two integers, T a family of sub-measurements of arity k, and V a family of
sub-measurements of arity $\ell$. Let $\{Z^h_{x_{>k}}\}$ be such that $\mathrm{E}_x \sum_h Z^h_{x_{>k}} (Z^h_{x_{>k}})^\dagger \leq \mathrm{Id}$. Then it holds that

E x ^Tr p (Z h x>k T h x>k V x>t ) -E x £ ^p( Z L T L ® V L) | < \JlNC(T,V). 
Proof. The proof is a direct consequence of the Cauchy-Schwarz inequality: write 



E x ^Tr p (Z h x> jt_ k ^V x> J-E x £ 

h g,h:h\ 



E * E 

g' h - h \x t x k _^g 



Tr p( Z x> k T x>k 



H x k-i~ 



1/2 



where the last inequality follows from the definition of INC(T, V) and our assumption on Z x . 



□ 
Lemma 40. Let T be a family of sub-measurements of arity k, X such that $X^\dagger X \leq \mathrm{Id}$, and $\{Z^h_{x_{>k}}\}$ such
that $\mathrm{E}_x \sum_h Z^h_x (Z^h_x)^\dagger \leq \mathrm{Id}$ (for instance, a family of sub-measurements of arity $\ell$, for any $\ell$). Then^{22}



E x E Tr p {T h x>k XTl k ® T x>k ) < 2 y INC(T, T) 

Proof. We first prove (28). We have 

E x ^Jr p (Z h x>f T h x>k ® T x>k ) - E x ^Tr p (Z h Xe ® T h x J 
h h 

= E x £Tr p (Z h x jT h x>k ® T x>k - T x>k ® T h x J) 

ft 

< (E^(zi(zi) + )) 1/2 ( E Tr,((T^0 2 )) 

V J 7 ^ft^ft' ' 

< yJlNC(T,T), 

where the second inequality follows from Cauchy-Schwarz. Regarding (29), we have 

ft^ft' /I 

From (28) we know that 



(28) 
(29) 



1/2 



E*E Tr p( T *>;fc XT *>J ® r *>* - E xJ^^ P {T x>k XT x>k ® T^J < y INC (T,T). 
ft ft 

The second term on the left-hand side satisfies 

ft ft 

1/2 



< (E,E Tr p( T ^ x+XT ^ ® r L))^(E,E Tr p(( T ^ - T lf ® T U ) 

ft ft 



1/2 



< ^INC(T,T), 

and this concludes the proof. □ 

C Proof of Corollary 10 

In this section we give the proof of Corollary 10. A standard method to convert multiple con- 
straints to a single constraint involving an exponential sum is by using small-bias probability 
spaces. 

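Definition 41 (Small-bias probability space). Let $n \in \mathbb{N}$. A set $S \subseteq \mathbb{F}_2^n$ is called an $\varepsilon$-bias probability
space if for every $c \in \mathbb{F}_2^n \setminus \{0\}$, it holds that

\[ \Big| \Pr_{\xi \in S}[c \cdot \xi = 0] - \Pr_{\xi \in S}[c \cdot \xi = 1] \Big| \leq \varepsilon. \]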
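For small n the bias in Definition 41 can be computed by brute force over all non-zero test vectors c. The following is an illustrative sketch only: the helper name `bias` and the two example sets are assumptions made for the demonstration, and the uniform distribution over S is taken as the underlying probability measure.

```python
from fractions import Fraction
from itertools import product


def bias(S, n):
    """Return the smallest eps such that S (a list of 0/1 tuples) is an eps-bias space."""
    worst = Fraction(0)
    for c in product((0, 1), repeat=n):
        if not any(c):
            continue  # the definition excludes c = 0
        zeros = sum(1 for xi in S if sum(a * b for a, b in zip(c, xi)) % 2 == 0)
        ones = len(S) - zeros
        worst = max(worst, Fraction(abs(zeros - ones), len(S)))
    return worst


n = 3
full_space = list(product((0, 1), repeat=n))
even_weight = [x for x in full_space if sum(x) % 2 == 0]

bias_full = bias(full_space, n)
bias_even = bias(even_weight, n)
```

The full space has bias 0, while the even-weight vectors have bias 1 because the test vector c = (1, 1, 1) always evaluates to 0 on them.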



22 A special case of interest is when the measurements are complete, in which case the statements simplify.
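Proposition 42. Let $n \in \mathbb{N}$, and let $S \subseteq \mathbb{F}_2^n$ be an $\varepsilon$-bias probability space. Let $\mathbb{F}$ be a finite field of
characteristic two. If $c \in \mathbb{F}^n \setminus \{0\}$, then

\[ \Pr_{\xi \in S}\Big[ \sum_{i=1}^n c_i \xi_i = 0 \Big] \leq \frac{1 + \varepsilon}{2}. \]

Proof. If $\mathbb{F} = \mathbb{F}_2$, then the proposition holds because

\[ \Pr_{\xi \in S}\Big[ \sum_{i=1}^n c_i \xi_i = 0 \Big]
   = \frac{1}{2} + \frac{1}{2}\Big( \Pr_{\xi \in S}\Big[ \sum_{i=1}^n c_i \xi_i = 0 \Big] - \Pr_{\xi \in S}\Big[ \sum_{i=1}^n c_i \xi_i = 1 \Big] \Big)
   \leq \frac{1 + \varepsilon}{2}. \]

For general $\mathbb{F}$, regard $\mathbb{F}$ as a vector space over $\mathbb{F}_2$, and let $\{\alpha_1, \ldots, \alpha_k\}$ be a basis of $\mathbb{F}$ over $\mathbb{F}_2$.
Write c as $c = \alpha_1 c^{(1)} + \cdots + \alpha_k c^{(k)}$, where $c^{(1)}, \ldots, c^{(k)} \in \mathbb{F}_2^n$. Because $c \neq 0$, we have that $c^{(j^*)} \neq 0$
for some $j^*$. By using the case of $\mathbb{F}_2$, it holds that

\[ \Pr_{\xi \in S}\Big[ \sum_{i=1}^n c^{(j^*)}_i \xi_i = 0 \Big] \leq \frac{1 + \varepsilon}{2}. \]

Since $\alpha_1, \ldots, \alpha_k$ are linearly independent over $\mathbb{F}_2$, $\sum_{i=1}^n c_i \xi_i = 0$ implies $\sum_{i=1}^n c^{(j)}_i \xi_i = 0$ for all j, and
therefore in particular $\sum_{i=1}^n c^{(j^*)}_i \xi_i = 0$. Therefore,

\[ \Pr_{\xi \in S}\Big[ \sum_{i=1}^n c_i \xi_i = 0 \Big] \leq \Pr_{\xi \in S}\Big[ \sum_{i=1}^n c^{(j^*)}_i \xi_i = 0 \Big] \leq \frac{1 + \varepsilon}{2}. \]

□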
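The reduction from a general field of characteristic two to $\mathbb{F}_2$ can be tried out on the smallest non-trivial example, GF(4). The sketch below is an illustration under stated assumptions: elements of GF(4) are encoded as integers 0..3 whose two bits are the coordinates in a basis {1, α}, so field addition is bitwise XOR, and multiplying a field element by a bit is trivial. The set S and the helper names are hypothetical choices for the demonstration.

```python
from fractions import Fraction
from itertools import product


def bias(S, n):
    """Smallest eps such that S (a list of 0/1 tuples) is an eps-bias space."""
    worst = Fraction(0)
    for c in product((0, 1), repeat=n):
        if not any(c):
            continue
        zeros = sum(1 for xi in S if sum(a * b for a, b in zip(c, xi)) % 2 == 0)
        worst = max(worst, abs(Fraction(2 * zeros - len(S), len(S))))
    return worst


def prob_zero_gf4(c, S):
    """Pr over xi in S that sum_i c_i * xi_i = 0 in GF(4)."""
    hits = 0
    for xi in S:
        acc = 0
        for ci, bit in zip(c, xi):
            if bit:
                acc ^= ci  # c_i * 1 = c_i, and addition in GF(4) is bitwise XOR
        hits += (acc == 0)
    return Fraction(hits, len(S))


n = 3
S = [x for x in product((0, 1), repeat=n) if x not in {(0, 0, 0), (1, 1, 1)}]
eps = bias(S, n)
holds = all(prob_zero_gf4(c, S) <= (1 + eps) / 2
            for c in product(range(4), repeat=n) if any(c))
```

For this S the bias is 1/3, and the check confirms the bound (1 + ε)/2 = 2/3 for every non-zero c in GF(4)^3.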



Theorem 43 (Alon, Goldreich, Hastad, and Peralta [AGHP92]). There exist a constant $c > 0$ and
a polynomial-time algorithm C which, given $K, M \in \mathbb{N}$, $i \in \{1, \ldots, K\}$ and $j \in \{1, \ldots, M\}$,
outputs a $C(K,M,i,j) \in \mathbb{F}_2$ such that the set $\{\xi^{(j)} : 1 \leq j \leq M\}$ defined by $\xi^{(j)} = (C(K,M,1,j), \ldots,
C(K,M,K,j))$ is a $(K/M^c)$-bias probability space in $\mathbb{F}_2^K$.

By arithmetizing the Boolean circuit for C by using a similar idea to the proof of Proposition 4.2 
of Ref. [BFL91], we obtain the following corollary. 

Corollary 44. There exist a constant $c > 0$ and a polynomial-time algorithm A which, given $1^k$ and $1^m$,
outputs $1^t$ and an arithmetic expression $f(i,j,l)$ in $k + m + t$ variables such that the set $\{\xi^{(j)} : j \in
\{0,1\}^m\}$ defined by $\xi^{(j)} = \big(\sum_{l \in \{0,1\}^t} f(i,j,l)\big)_{i \in \{0,1\}^k}$ is a $2^{k - cm}$-bias probability space in $\mathbb{F}_2^{2^k}$.

Proof of Corollary 10. The protocol works as follows. The verifier first computes $m = \lceil (k + 2)/c \rceil$,
where c is the constant in Corollary 44. He runs the algorithm of Corollary 44 with parameters k
and m to obtain $t \in \mathbb{N}$ and an arithmetic expression $f(i,j,l)$ in $k + m + t$ variables. Let $d'$ be
the maximum degree of f in single variables. He chooses $j \in \{0,1\}^m$ uniformly at random, and
sends j to the prover. Then he simulates the protocol in Lemma 9 with explicit inputs $k + t$ and $d +
d'$ and implicit input $h_j(i,l) := f(i,j,l)h(i)$.

For $i \in \{0,1\}^k$ and $j \in \{0,1\}^m$, let $\xi^{(j)}_i = \sum_{l \in \{0,1\}^t} f(i,j,l) \in \mathbb{F}$ and $\xi^{(j)} = (\xi^{(j)}_i)_{i \in \{0,1\}^k} \in \mathbb{F}_2^{2^k}$.
Because $m \geq (k + 2)/c$, Corollary 44 guarantees that $\{\xi^{(j)} : j \in \{0,1\}^m\}$ is a 1/4-bias probability
space.

Let $c_i = h(i)$. Then for all $j \in \{0,1\}^m$, it holds that

\[ \sum_{i \in \{0,1\}^k,\ l \in \{0,1\}^t} h_j(i,l) = \sum_{i \in \{0,1\}^k} \xi^{(j)}_i c_i. \qquad (30) \]

Completeness: Suppose that $c_i = 0$ for all $i \in \{0,1\}^k$. Then, by Eq. (30), it holds that

\[ \sum_{i \in \{0,1\}^k,\ l \in \{0,1\}^t} h_j(i,l) = 0 \]

for all $j \in \{0,1\}^m$. Therefore, the completeness of the protocol in Lemma 9 implies that the
protocol constructed above also has perfect completeness.

Soundness: Suppose that $c \neq 0$. By Proposition 42, it holds that

\[ \Pr_{j}\Big[ \sum_{i \in \{0,1\}^k} \xi^{(j)}_i c_i = 0 \Big] \leq \frac{1 + 1/4}{2} = \frac{5}{8}. \]

Eq. (30) and the soundness in Lemma 9 imply that for any $j \in \{0,1\}^m$ such that $\sum_{i \in \{0,1\}^k} \xi^{(j)}_i c_i \neq 0$,
the acceptance probability conditioned on the choice of j is at most $(d + d')(k + t)/|\mathbb{F}|$. Therefore,
the overall acceptance probability is at most $5/8 + (d + d')(k + t)/|\mathbb{F}|$. The corollary follows
because $d'$ and t are polynomially bounded in k. □
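The parameter choices in this proof amount to a short piece of exact arithmetic. The sketch below is only illustrative: the value of the constant c from Corollary 44 is not specified in the text, so c = 1/2 is a placeholder assumption, as are the sample values of d, d', k, t and |F|; the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil


def choose_m(k, c):
    """m = ceil((k + 2) / c), so that the bias 2^(k - c*m) is at most 1/4."""
    return ceil(Fraction(k + 2) / Fraction(c))


def soundness_bound(d, d_prime, k, t, field_size):
    """Upper bound 5/8 + (d + d')(k + t)/|F| on the overall acceptance probability."""
    return Fraction(5, 8) + Fraction((d + d_prime) * (k + t), field_size)


k = 10
c = Fraction(1, 2)             # placeholder for the constant of Corollary 44
m = choose_m(k, c)
bias_exponent = k - c * m      # the bias is 2^(k - c*m); an exponent <= -2 gives bias <= 1/4
bound = soundness_bound(d=3, d_prime=2, k=k, t=20, field_size=2 ** 20)
```

With these sample values m = 24 and the exponent k − cm equals −2, so the bias is exactly 1/4, and the bound stays below 1 once |F| is large compared with (d + d')(k + t).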